INTRODUCTION


Research is the quest for knowledge or a systematic investigation in order to establish 
facts. It helps to solve problems and to increase knowledge. The basic aim of 
research is to discover, interpret and develop methods and systems to advance 
human knowledge on diverse scientific matters. There are different types of 
research, such as exploratory, descriptive and experimental. Exploratory research 
is done when few or no previous studies of the subject exist. Descriptive research 
is used to classify and identify the characteristics of a subject. Experimental research
suggests or explains why or how something happens. Thus, one of the primary 
aims of research is to explain new phenomena and generate new knowledge. Before 
conducting any research, a specific approach is to be decided; this is called research 
methodology. Research methodology refers to the way research can be conducted. 
It is also known as the process of collecting data for various research projects and 
helps to understand both the products as well as the process of scientific enquiry. 
A research process involves selection and formulation of a research problem,
research design, sample strategy or sample design, as well as the interpretation 
and preparation of research report. 




A few important factors in research methodology include the validity and 
reliability of research data and the level of ethics. A job is considered only half done if
the data analysis is conducted improperly. Formulation of appropriate research
questions and selection of a probability or non-probability sample are followed by
measurement using survey and scaling techniques. A research design is a systematic
plan for collecting and utilizing data so that the desired information can be obtained 
with sufficient accuracy. Therefore, research design is the means of obtaining reliable, 
authentic and generalized data. Research methodology is a very important function 
in today’s business environment. There are many new trends in research 
methodology through which an organization can function in this dynamic 
environment. 


This book, Research Methods, has been designed keeping in mind the self- 
instruction mode (SIM) format and follows a simple pattern, wherein each unit of 
the book begins with the Introduction followed by the Objectives for the topic. 
The content is then presented in a simple and easy-to-understand manner, and is 
interspersed with Check Your Progress questions to reinforce the student’s 
understanding of the topic. A list of Self-Assessment Questions and Exercises is 
also provided at the end of each unit. The Summary and Key Words further act as 
useful tools for students and are meant for effective recapitulation of the text. 


FUNDAMENTALS OF RESEARCH


UNIT 1 MEANING, TYPES AND 
PROCESS OF RESEARCH 
1.0 INTRODUCTION


Broadly speaking, the search for knowledge is referred to as research. 
Research can also be defined as an art of scientific investigation. Within the 
academic scenario, research comprises defining and redefining problems, 
formulating hypotheses or suggested solutions; collecting, organising and
evaluating data; making deductions and reaching conclusions; and in the end
carefully testing the conclusions to determine whether they fit the formulated
hypotheses. This unit will provide an overview of research. It will discuss the
application and types of research. The process of research is also explained 
in this unit. 


1.1 OBJECTIVES 


After going through this unit, you will be able to: 
• Define research
• Describe the application of research in various fields
• Explain the different types of research
1.2 DEFINITION AND MEANING OF BUSINESS
RESEARCH 


Research generally begins with a question or a problem. The purpose
of research is to find solutions through the application of systematic and 
scientific methods. 


Meaning and Definitions 


Research is a tool that is a building block and a sustaining pillar of every 
discipline—scientific or otherwise—that one knows of. Before comprehending 
the true meaning of the term, we would like to make it clear that this book 
primarily focuses on the process of business research. The premise of this 
decision-oriented enquiry is vast and may range from the simplistic view, 
which involves compilation and validation of information, to an exhaustive 
theory and model construction. To distinguish between non-scientific and 
scientific method, we would like to consider a few definitions of research. 


One of the earliest distinctions was made by Lundberg (1942) who 
stated ‘Scientific methods consist of systematic observation, classification, 
and interpretation of data. Now obviously, this process is one in which nearly 
all people engage in their daily life. The main difference between our day-to- 
day generalizations and the conclusions usually recognized as the scientific 
method lies in the degree of formality, rigorousness, verifiability, and general 
validity of the latter.’ 


Fred Kerlinger (1986) also validated the thought and stated that 
‘Scientific research is a systematic, controlled and critical investigation of 
propositions about various phenomena.’ Grinnell (1993) has simplified the 
debate and stated ‘The word research is composed of two syllables, re and 
search. The dictionary defines the former as a prefix meaning again, anew or 
over again and the latter as a verb meaning to examine closely and carefully, 
to test and try, or to probe. Together they form a noun describing a careful, 
systematic, patient study and investigation in some field of knowledge, 
undertaken to establish facts or principles.’ 


Thus, drawing from the common threads of the above definitions, we 
derive that management research is an unbiased, structured, and sequential 
method of enquiry, directed towards a clear implicit or explicit business 
objective. This enquiry might lead to validating existing postulates or arriving 
at new theories and models. 


The most important and difficult task of a researcher is to be as objective 
and neutral as possible. The temptation to skew the results in the hypothesized 
direction has to be avoided at all costs. Magazine articles and newspaper
surveys that want to prove a point might skew opinion polls in favour of
the Capitalists or the Republicans, or on the need for reservation versus
no reservation in educational institutes, but a researcher has to collect
and display the findings of the research as objectively as possible.


The last, and most important, aspect of our definition that needs to be
carefully considered is the decision-assisting nature of business research. 
Thus, as Easterby-Smith et al. (2002) state, business research must have some 
practical consequences, either immediately, when it is conducted for solving 
an immediate business problem or when the theory or model developed can be 
implemented and tested in a business setting. The world of business demands 
that managers and researchers work towards a goal—whether immediate or 
futuristic, else the research loses its significance in the field of management. 


Some of the proposed definitions of research are as follows: 


• Redman and Mory have defined research as a systematized effort
to gain knowledge.

• In the words of renowned researcher Clifford Woody, research
involves defining and redefining problems; formulating suggested
solutions or hypotheses; collecting, evaluating and organizing data;
reaching conclusions and making deductions; and carefully testing the
conclusions to find out whether they fit the formulated hypothesis.

• D. Slesinger and M. Stephenson in the Encyclopaedia of Social
Sciences define research as the manipulation of things, concepts or
symbols for the purpose of generalizing to extend, correct or verify
knowledge, whether that knowledge aids in the construction of a
theory or in the practice of an art.


Purpose of Research 

The principal purpose of research is to find solutions to problems
systematically. In general, the purpose of research can be specified as follows:

• To acquire familiarity with a phenomenon

• To study the frequency of connection or independence of any activity
or occurrence

• To determine the characteristics of an individual or a group of activities
and the frequency of the occurrence of these activities

• To test a hypothesis about a causal relationship that exists between
variables


Characteristics of Research 


The process of research helps increase the creative ability of a decision-maker. 
The various characteristics of research are as follows: 


• Interdisciplinary Team Approach: This approach is based on the
principle of using the expertise and experience of different personnel
working in different disciplines within an organization. An individual
cannot be an expert in all the areas of operation. So, researchers take
help from other experts, who are specialists in their respective fields.
Under the interdisciplinary team approach, an expert may use old
solutions, which were used in the past, as research material for finding
the most appropriate solution to the problem.

• Methodological Process: Researchers use scientific methods and
techniques to provide an optimum solution to the problems. The
scientific methods include observing and defining a problem and
formulating a hypothesis related to the results of the scientific methods
and techniques. If the hypothesis is accepted, then its results should
be executed in an organization, but if the hypothesis is not accepted,
then another hypothesis is formulated.

• Objectivistic Approach: The aim of an organization is to provide
optimal solutions to various problems. It is essential to measure the
desirability of a solution for achieving the organizational objective.
This measured desirability helps in comparing the alternative courses
of action with respect to their outcomes.

• Economical in Nature: In an uncertain and complex situation, research
helps in reducing the costs of inventory and thereby improving the
profits. For example, in inventory control, research can provide
scientific rules for reducing acquisition costs and inventory-carrying
costs.


Nature of Research 


Good and effective research is identified by its nature, which signifies its focus 
on the research topic, a systematic way of implementation, control over the 
variables and so on. The nature of good and effective research is as follows:

• Objectivity: Good research is objective in terms of offering solutions
to the research questions. This calls for planning and creation of suitable
hypotheses to avoid a lack of relationship between the research questions
and the hypotheses.

• Control: Good research is capable of controlling all the variables.
This necessitates randomization at all stages and ascertains sufficient
control over the independent variables.

• Universality: Good research yields almost the same result when an
identical methodology is used, so that the result can be applied to
similar situations.

• Free from personal biases: Good research is free from the
researcher’s personal biases and must be based on objectivity and not
subjectivity.

• Systematic: Good research has several well-planned steps that are
inter-connected and logical.

• Reproducibility: A researcher, while conducting the research, is
able to obtain approximately the same results by using an identical
methodology for conducting the investigation.
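
The points above on randomization and on obtaining the same results from an identical methodology can be illustrated with a small sketch. The Python snippet below is only an illustration under stated assumptions: the participant labels, the two group names and the seed value are hypothetical and do not come from the text. It shuffles the participants at random before splitting them into two groups, and it fixes the random seed so that repeating the procedure reproduces the same allocation.

```python
import random


def assign_groups(participants, seed=42):
    """Randomly split participants into a treatment and a control group.

    The seed is fixed (an assumption made for this sketch) so that the
    same allocation is reproduced whenever the procedure is repeated
    with identical inputs.
    """
    rng = random.Random(seed)      # private generator, leaves global state alone
    shuffled = list(participants)  # copy, so the caller's list is unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant labels, used only for illustration.
participants = [f"P{i}" for i in range(1, 9)]
first_run = assign_groups(participants)
second_run = assign_groups(participants)

print(first_run)
print(first_run == second_run)  # identical methodology, identical allocation
```

Random allocation spreads uncontrolled differences between participants across both groups, while the fixed seed lets another researcher repeat the allocation exactly.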


1.2.1 Applications of Business Research 


The discussion so far points out the role and significance of research in 
aiding business decisions. The question one might ask here is about the 
critical importance of research in different areas of management. Is it most 
relevant in marketing? Do financial and production decisions really need 
research assistance? Does the method or process of research change with 
the functional area? 


The answer to all the above questions is NO. Business managers in 
each field—whether human resources or production, marketing or finance— 
are constantly being confronted by problem situations that require effective 
and actionable decision making. Most of these decisions require additional 
information or information evaluation, which can be best addressed by 
research. While the nature of the decision problem might be singularly 
unique to the manager, organization and situation, broadly for the sake of 
understanding, it is possible to categorize them under different heads. 


[Figure: flowchart of the research process. Box labels, in the order
extracted: Management Dilemma (Basic vs Applied); Defining the Research
Problem; Formulating the Research Hypothesis; Developing the Research
Proposal; The Research Framework; Research Design; Data Collection Plan;
Sampling Plan; Instrument Design; Pilot Testing; Data Collection; Data
Refining and Preparation; Data Analysis and Interpretation; Research
Reporting; Management/Research Decision]


Fig 1.1 The Process of Research
Marketing function


This is one area of business where research is the lifeline and is carried out 
on a vast array of topics and is conducted both in-house by the organization 
itself and outsourced to external agencies. Broader industry- or product- 
category-specific studies are also carried out by market research agencies and 
sold as reports for assisting in business decisions. Studies like these could be: 


• Market potential analysis; market segmentation analysis and demand
estimation

• Market structure analysis, which includes market size, players and
market share of the key players

• Sales and retail audits of product categories by players and regions
as well as national sales; consumer and business trend analysis,
sometimes including short- or long-term forecasting


However, the above-mentioned areas need not always be outsourced;
sometimes they might be handled by a dedicated research or new product
development department in the organization. Other than these, an
organization also carries out research related to all four Ps of marketing,
such as:


• Product research: This would include new product research; product
testing and development; product differentiation and positioning;
testing and evaluating new products and packaging research; brand
research, including equity tracking and imaging studies.

• Pricing research: It includes price determination research; evaluating
customer value; competitor pricing strategies; alternative pricing
models and implications.

• Promotional research: It includes everything from designing the
communication mix to the design of advertisements, copy testing,
measuring the impact of alternative media vehicles and the impact of
competitors’ strategy.

• Place research: It includes locational analysis, design and planning of
distribution channels and measuring the effectiveness of the distribution
network.


These days, with the onset of increased competition and the need 
to convert customers into committed customers, customer relationship 
management (CRM), customer satisfaction, loyalty studies and lead user 
analysis are also areas in which significant research is being carried out. 


Personnel and Human Resource Management 


Human resources (HR) and organizational behaviour is an area which involves
basic or fundamental research, as a lot of academic, macro level research
may be adapted and implemented by organizations into their policies and
programmes. Applied HR research, by contrast, is more predictive and solution
oriented. Though there are a number of academic and organizational areas in
which research is conducted, some key contemporary areas which seem
to attract more research are as follows:


• Performance management; leadership analysis, development and
evaluation; organizational climate and work environment studies;
talent and aptitude analysis and management; organizational change
implementation, management and effectiveness analysis

• Employee selection and staffing: This includes pre and on-the-job
employee assessment and analysis; staffing studies

• Organizational planning and development: Culture assessment,
either organization specific or the study of individual and merged
culture analysis for mergers and acquisitions; manpower planning and
development

• Incentive and benefit studies: These include job analysis and
performance appraisal studies; recognition and reward studies;
hierarchical compensation analysis; employee benefits and reward
analysis, both within the organization and industry best practices

• Training and development: These include training need gap analysis;
training development modules; monitoring and assessing impact and
effectiveness of training

• Other areas include employee relationship analysis; labour studies;
negotiation and wage settlement studies; absenteeism and accident
analysis; turnover and attrition studies and work-life balance analysis


Critical success factor analysis and employer branding are some 
emerging areas in which HR research is being carried out. The first is a 
participative form of management technique, developed by Rockart (1981) in 
which the employees of an organization identify their critical success factors 
and help in customizing and incorporating them in developing the mission 
and vision of their organization. The idea is that a synchronized objective
will benefit both the individual and the organization, and will lead to
commitment and ownership on the part of the employees. Employer branding
is another area which is being actively investigated, as the customer perception
(in this case it is the internal customer, i.e., the employee) about the employer
or the employing organization has a strong and direct impact on the employee's
intention to stay or leave. Thus, this is a subjective qualitative construct which can
have a hazardous effect on organizational effectiveness and efficiency.


Financial and Accounting Research 


The area of financial and accounting research is so vast that it is difficult to 
provide a pen sketch of the research areas. In this section, we are providing 
just a brief overview of some research topics: 
• Asset pricing, corporate finance and capital markets: The focus here
is on stock market response to corporate actions (IPOs, takeovers and
mergers), financial reporting (earnings and firm specific announcements)
and the impact of factors on returns, e.g., liquidity and volume

• Financial derivatives and interest rate and credit risk modelling: This
includes analysing interest rate derivatives, development and validation
of corporate credit rating models and associated derivatives; analysing
corporate decision making and investment risk appraisal

• Market based accounting research: Analysis of corporate financial
reporting behaviour; accounting-based valuations; evaluation and usage
of accounting information by investors and evaluation of management
compensation schemes

• Auditing and accountability: This includes both private and public
sector accounting studies; analysis of audit regulations; analysis of
different audit methodologies; governance and accountability of audit
committees

• Financial econometrics: This includes modelling and forecasting in
volatility, risk estimation and analysis

• Other related areas of investigation are in merchant banking and
insurance sector and business policy and economics areas


Considering the nature of the decision required in this area, the research 
is a mix of historical and empirical research. Behavioural finance is a new 
and contemporary area in which, probably, for the first time subjective and 
perceptual variables are being studied for their predictive value in determining 
consumer sentiments. 


Production and Operation Management 


This area of management is one in which quantifiable implementation of 
the research results takes on huge cost and process implications. Research 
in this area is highly focused and problem specific. The decision areas in 
which research studies are carried out are as follows: 


• Operation planning: These include product/service design and
development; resource allocation and capacity planning

• Demand forecasting and decision analysis

• Process planning: Production scheduling and material requirement
management; work design planning and monitoring

• Project management and maintenance management studies

• Logistics and supply chain and inventory management analysis

• Quality estimation and assurance studies: These include total quality
management (TQM) and quality certification analysis


This area of management also invites academic research which might
be macro and general but helps in developing technologies such as JIT
(just-in-time technology) and EOQ (economic order quantity, an inventory
management model), which are then adapted by organizations for optimizing
operations.
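
EOQ, mentioned above, can be made concrete with the classic square-root formula, EOQ = sqrt(2DS/H), where D is annual demand, S is the cost of placing one order and H is the cost of carrying one unit in inventory for a year. The text itself does not state the formula; the short Python sketch below assumes this standard textbook form, and the function names and the demand and cost figures are hypothetical values chosen only for illustration.

```python
import math


def economic_order_quantity(annual_demand, order_cost, holding_cost):
    """Classic EOQ: the order size that minimises ordering plus carrying cost."""
    if annual_demand <= 0 or order_cost <= 0 or holding_cost <= 0:
        raise ValueError("all inputs must be positive")
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)


def total_annual_cost(quantity, annual_demand, order_cost, holding_cost):
    """Annual ordering cost plus annual carrying cost for a given order size."""
    ordering = annual_demand / quantity * order_cost  # number of orders x cost per order
    carrying = quantity / 2 * holding_cost            # average stock x cost per unit
    return ordering + carrying


# Hypothetical figures: 1200 units a year, 50 per order, 3 per unit per year.
demand, order_cost, holding_cost = 1200, 50.0, 3.0
eoq = economic_order_quantity(demand, order_cost, holding_cost)

print(eoq)  # 200.0
for quantity in (150, eoq, 250):
    print(quantity, total_annual_cost(quantity, demand, order_cost, holding_cost))
```

With these assumed figures, ordering 200 units at a time costs 600 a year, against 625 for 150 units and 615 for 250 units. This is the kind of scientific rule for reducing acquisition and inventory-carrying costs that such research aims to provide.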


Cross-functional research 


Business management being an integrated amalgamation of all these and 
other areas sometimes requires a unified thought and approach to research. 
These studies require an open orientation where experts from across the 
disciplines contribute to and gain from the study. For example, an area such 
as new product development requires the commitment of the marketing, 
production and consumer insights team to exploit new opportunities. Other 
areas requiring cross functional efforts are: 


• Corporate governance and ethics: the role of social values and ethics
and their integration into a company’s working is an area that is of
critical significance to any organization

• Technical support systems, enterprise resource planning systems,
knowledge management, and data mining and warehousing are
integrated areas requiring research on managing coordinated efforts
across divisions

• Ecological and environmental analysis; legal analysis of managerial
actions; human rights and discrimination studies


Check Your Progress 


1. What is the principal purpose of research? 


2. What are some of the areas which involve basic or fundamental 
research? 


1.3 TYPES OF RESEARCH 


The types of research depend on the field in which the specific research study 
is performed. The different types of research are as follows: 


• Pure research: This type of research is mainly concerned with
identifying certain important principles in a specific field. It intends to
find out information that has a broad base of application. Examples of
fundamental research are Joan Robinson’s imperfect competition theory
in economics and Maslow’s hierarchy of needs theory in motivation.

• Applied research: This type of research aims at finding a solution to
an immediate problem faced by a society or an industrial organization.
It is supposed to discover a solution to some basic practical problems.
Applied research suggests corrective methods to minimize a social or
business problem.


• Historical research: It is the process of systematic examination of
past events to be able to present an account of what has happened in
the past. Most people believe that it is simply an accumulation of dates
and facts of past events. Actually, it is not so. It is a flowing, dynamic
account of past events involving an interpretation of these events to
understand the factors, personalities and ideas that influenced these
events. One of the goals of historical research is to communicate an
understanding of past events.

Historical research is important because it:
(i) Helps uncover the unknown or unrecorded facts/events
(ii) Answers questions
(iii) Facilitates identification of the relationship that the past has with
the present
(iv) Records and evaluates the accomplishments of individuals,
agencies, or institutions
(v) Assists in understanding the culture in which we live

It is not possible to identify any single approach that is used in
conducting historical research. Some general steps are typically
followed, which include the following:
(i) Identifying the research topic
(ii) Formulating the research problem
(iii) Collecting data and reviewing literature
(iv) Evaluating materials
(v) Synthesizing data
(vi) Preparing report

• Futuristic research: This type of research does not really predict
the future as the name may seem to suggest. It is actually a branch of
operations research aimed at conducting long-range planning based on
forecasting using mathematical models, cross-disciplinary treatment
of the subject, systematic use of expert judgements or opinions and a
systems analytical approach.


• Analytical research: Analytical research is concerned with cause-effect
relationships. For example, examining the fluctuations in India’s
international trade during a particular period of time would require
descriptive research. However, explaining why and how India’s trade
balance moves in a specific way over a period of time is an example
of analytical research.

• Synthetic research: Synthetic research deals with basic mechanisms
or relationships of the different mechanisms within the entire organism.
This type of investigation depends neither on the methods used nor on
the subject of investigation.

• Descriptive research: Descriptive research attempts to determine,
describe, or identify what is, unlike analytical research, which aims at
establishing why it is that way or how it came to be that way.
Descriptive research uses descriptions, classifications, measurements,
and comparisons to describe various phenomena.


• Prescriptive research: This type of research encompasses both the
descriptive and explicative dimensions of research. The descriptive
level aims to describe the research object whereas the explicative level
interprets the observed phenomena. Prescriptive research tests the
relevance of both explicative and prescriptive hypotheses. It supposes
an interaction between the researcher and the field of study so as to
implement recommendations or propositions and then measure their
impact.

• Survey research: It is a collection of quantified data from a section
of the population for describing or identifying relationships between
variables that may point to causal relationships or predictive patterns
of influence. A census can be considered a survey that measures the
nature of the human resources available, their level of education, their
professions, etc.

• Experimental: The experimental type of research enables you to
calculate the findings, employ statistical and mathematical devices
and measure the results thus quantified.


• Case study: This method undertakes intensive research that requires a
thorough study of a particular unit, for example, an industrial or banking
unit, for data collection.

• Generic research: It is research that exists between applied research and
basic research. It is considered to be a less academic way of researching,
used when research has to be conducted within a short frame of time.
It results in a qualitative description showing how participants not
only understand but also make sense of their experiences. The very
lack of a specific method is the advantage of this research. Instead of
being obsessed with one particular method, researchers use various
procedures and strategies of data collection without being bound by
too many technical and philosophical issues.

• Formulative or exploratory: It helps examine a problem with suitable
hypotheses. This research, in social science, is mainly significant
for clarifying concepts and innovations for further research. The
researchers are mainly concerned with the principles of developing
hypotheses and testing them with statistical tools.


• Ex post facto: This type of research is similar to experimental
research, but it is conducted after the fact to deal with situations that
have already occurred in or around an organization. Examples of such
research are the market failure of an organization’s product being
researched later and research into the causes of a landslide in the country.


• Disciplinary research: This research is aimed at improving a discipline. It
is concerned with the theories, relationships, and analytical procedures
and techniques within the discipline, for example, economic research
or social research.

• Subject-matter research: This is research on any subject of interest
within a specific discipline.

• Problem-solving research: It is research designed to solve a specific
problem for a specific decision-maker. It is often multidisciplinary, for
example, a multidisciplinary study of a new drug for cancer involving
medical doctors, engineers, and an economist.


Besides these, there are several other types of research like evaluation 
research, survey research, assessment research and comparative research. 


Researchers in quantitative research classify features of the
research and then build statistical models to explain what they observe.
The researcher in quantitative research must know clearly in advance what
is to be researched before starting the research. The focus in this research
is concise and narrow. Thus, quantitative research is conclusive. The data
collected in this research is measurable and can be analysed easily by the
researcher. Unlike qualitative research, quantitative research deals with what,
where and when to research. This type of research is used in later phases of
the research cycle. Quantitative research is basically the study of numbers and
statistics. The researcher can use different types of tools like questionnaires
or equipment to gather this numerical information. The data collected in
quantitative research is often considered more efficient to work with than
qualitative data. This type of research is not concerned with process. It
only deals with what will be the outcome or product. Thus, the researcher
in quantitative research tends to be objective; therefore, it is also called
objective research.
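
As a rough illustration of what classifying features and summarising numerical data can look like in practice, the Python sketch below tabulates a handful of questionnaire responses. The department names, the 1 to 5 satisfaction scale and the scores are hypothetical data made up for this example; the sketch uses only simple counts and averages, not a full statistical model.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical questionnaire data: (department, satisfaction score on a 1-5 scale).
responses = [
    ("marketing", 4), ("marketing", 5),
    ("finance", 3), ("finance", 2),
    ("production", 4), ("production", 3),
]

# Classify the responses by department and count each class.
counts = Counter(department for department, _ in responses)

# Group the scores by department, then summarise each group.
scores_by_department = {}
for department, score in responses:
    scores_by_department.setdefault(department, []).append(score)
mean_by_department = {
    department: mean(scores) for department, scores in scores_by_department.items()
}

overall_scores = [score for _, score in responses]
print(counts)
print(mean_by_department)
print(mean(overall_scores), median(overall_scores))
```

The researcher decides in advance which features to record (here, department and score), which reflects the narrow, pre-specified focus described above.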


It should be kept in mind that it is difficult to categorise a particular
piece of research under any one major head. This is because, no matter what
the nature or method of research, the research problem is essentially treated in an
interdisciplinary manner. Interdisciplinary treatment means the borrowing
of ideas from related disciplines connected with the research topic for
more authenticity. For example, management is not an individual discipline
in its own right and requires an integral approach of various disciplines like
finance and human resources, etc.


Check Your Progress 


3. What is pure research concerned with? 
4. List the names of any five types of research.


5. What is quantitative research? 
1.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS
1. The principal purpose of research is to find solutions to problems
systematically.


2. Human resources (HR) and organizational behaviour are areas which 
involve basic or fundamental research as a lot of academic, macro level 
research may be adapted and implemented by organizations into their 
policies and programmes. 

3. Pure research is mainly concerned with identifying certain important 
principles in a specific field. 

4. The types of research depend on the field in which the specific research 
study is performed. The five types of research are: 


(a) Pure research 

(b) Applied research 

(c) Historical research 
(d) Analytical research 
(e) Descriptive research 


5. Quantitative research is basically the study of numbers and statistics.
Researchers in quantitative research classify features of the research
and then build statistical models to explain what they observe. The
researcher in quantitative research must know clearly in advance what
he/she wants to research before starting the research. The focus in this
research is concise and narrow. Thus, quantitative research is conclusive.


1.5 SUMMARY


• The purpose of research is to find solutions through the application of
systematic and scientific methods.

• The most important and difficult task of a researcher is to be as objective
and neutral as possible.

• Good and effective research is identified by its nature, which signifies
its focus on the research topic, a systematic way of implementation,
control over the variables and so on.

• Business managers in each field, whether human resources or
production, marketing or finance, are constantly being confronted
by problem situations that require effective and actionable decision
making. Most of these decisions require additional information or
information evaluation, which can be best addressed by research.

• The types of research depend on the field in which the specific research
study is performed.

• Researchers in quantitative research classify features of the research
and then build statistical models to explain what they observe.


• Fundamental Research: It focuses on finding generalizations and
formulating theories.

• Applied Research: It aims at finding a solution for an immediate
problem facing a society or a business/industrial organization.

• Total Quality Management: It refers to a system of management
based on the principle that every member of staff must be committed
to maintaining high standards of work in every aspect of a company’s
operations.

• Social Research: This refers to research conducted by social scientists
in order to analyse a vast breadth of social phenomena.

• Quantifiable: It means something that is able to be expressed or
measured as a quantity.


1.7 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short Answer Questions 


1. Distinguish between non-scientific and scientific method. 
2. What are the characteristics of research? 
3. Which type of research is concerned with cause-effect relationships?


Long Answer Questions 


1. Describe the application of business research in various fields. 
2. Explain the different types of research. 
3. Discuss financial and accounting research in detail.


2.0 INTRODUCTION


In the previous unit, you were introduced to various aspects of business 
research. Here we will discuss the advancements made in research in recent 
years. Business research, like all other domains, has been greatly influenced 
by technological advancements in recent years. Research that is carried out 
today bears no resemblance to business research of earlier times. One of the 
notable advances has been the advent of the Internet. We will discuss this 
aspect in detail. The unit will also discuss the differences between different 
types of research. 


2.1 OBJECTIVES 


After going through this unit, you will be able to: 
• Describe online research and its data collection methods
• Discuss the differences between various types of research


2.2 RECENT ADVANCEMENTS IN RESEARCH: 
ONLINE RESEARCH 


If the 1960s was the era of rationality and the search for universal paradigms
and absolute truths which could stand the test of time and boundaries, the
1990s saw turmoil and uncertainty. The aftermath of nuclear warfare
and environmental calamities like pollution, global warming and genetic
malformations led to post-modernism and a questioning mindset, in which
more and more people across the world sought a world that was surreal
and thus free from the chaos and disappointments as well as threats of the real
world. The need was ably supported by the extremely fast digital growth that
was happening across the world. Today, almost two decades later, more than
one million people across physical boundaries stand connected through online
communities, networks, groups, forums and podcasts. The huge success of
virtual social worlds such as Second Life is definite proof of the fact that
more and more consumers are taking on an alternative identity (or avatar),
which has no constraints or rules. This is only one part of it. The success
of social communities (Facebook), virtual product sales (on forums such as
Flipkart and Snapdeal), gaming (World of Warcraft) and knowledge/opinion
sharing (Twitter and Wikipedia) all point towards the relevance of seeking
time and information from data sources that are available (secondary) and
can be sought (primary) in a virtual environment.


The Relevance and Domain of Online Research 


In the last decade, what we saw was the recognition of the Internet as a useful 
source of secondary information, such as databases and online resources. 
However, today it is being recognized as a separate method as it involves 
unique challenges and processes related to sampling, data collection and 
measurement metrics which are not prevalent in traditional research as we 
know it. Thus, it is critical to understand these issues from the perspective 
of using the medium effectively for conducting a research study. 


A typical phenomenon of virtual space is that companies now have to
face the true aspects of designing consumer-centric strategies. Thus, for the
new era of co-creation by consumers and business managers, the business
researcher needs to be “listening” to what the brand communities are saying,
“talking” with them for co-creation, and “energizing” and “supporting” them to
complete the engagement with the consumer. The medium is exciting and has
huge potential, yet it is at an evolving stage as it faces constant challenges of
changes in terms of business-customer interface as well as ethical constraints.
Thus, there is recognition of the value of the process, but serious concerns
about it also exist. Before we go on to the specifics of the
online research process, let us briefly examine the pros and cons of using
the method.


Advantages and Disadvantages of Online Research 


Just like the traditional research process we have gone through in the textbook,
online research has strengths and weaknesses associated with it. Some of these are
listed below:


Advantages 


The advantages are as follows: 


• Low cost: The most supportive argument is the cost of conducting the
online research. Researchers have found it to be almost 30% cheaper to
conduct a study online. The only significant cost the investigator may
incur is in the use of the software to generate the study questionnaire.
This has also been resolved to a certain extent as a number of free
sites are available that can be used for designing and uploading the
instrument. The second saving is the negligible to zero cost of
reaching the sample respondents.


• Quick response time: This applies both to secondary data and to
collecting data that is primary in nature from the sample group.

• Better respondent engagement: With the innovation in design and
tools available on the net, the questionnaire and the information seeking
can be made very engaging and interesting for the respondent.

• Extensive reach: The advantage of the virtual medium is that there
are no distances in terms of approaching the sample group. Also, with
advanced software available, it is possible to enable an almost instant
translation of the questions into the language of the respondent.

• Anonymity and answering: Since the researcher/investigator is in
most instances not present, the respondent feels freer to answer, and the
relative anonymity gives them the assurance to answer sensitive and
open-ended questions.

• Accuracy in data entry: Since the response categories for the
closed-ended questions are defined at the beginning, there is no likelihood
of human error in entering the answers in the spreadsheet. Other records,
such as the time of access and the time taken to complete the questionnaire,
are precisely recorded, and this again ensures zero error.

• Authentic data sources: With more and more companies and research
agencies realizing the merit of the medium, reputed companies like
Nielsen, Forrester and Euromonitor are establishing online divisions
to cater to the needs of the business and academic researcher.


Disadvantages 


The disadvantages are as follows: 


• Skewed sample: The constraint of the method is that data,
especially primary data, can only be collected from people who are
Internet-savvy. Thus, there is the issue of generalizability.

• Representativeness and authenticity: The anonymity of the
respondent is also a problem, as one does not know who is on the other
side and the person might not reveal his/her true identity, age or gender.
Thus, one may draw conclusions based on a sample
group that does not match the population under study.

• Significant cues: A lot of physical cues that come from body language
and voice modulation are lost in an online survey. However,
analysis of emoticons (smiley faces, punctuation and word forms)
in the text is being researched to try and overcome this weakness.

• Malicious responses: Once the questionnaire is posted for response, one
has no control over who responds. It might happen that a disgruntled
employee or customer is extremely negative and fills in the
questionnaire not once but multiple times, thus distorting the output.


• Design problems: Online surveys are more engaging provided one
knows how to make effective use of the software features. Thus, they
are also difficult to design, and the average online researcher might not
be proficient in doing so.


The online research process is by and large the same as the traditional
process in terms of the steps involved. However, special mention needs to be
made of three important issues: sampling, data collection and data metrics.


Sampling For Online Research Studies 


One of the major challenges in online studies is designing an effective
sampling plan and obtaining a representative sample. Since no concrete
sampling frame of Internet users exists, obtaining a probability sample is a
difficult task. As a result of non-representativeness, the sampling
error becomes considerable, raising doubts about the
results of the study. In case the research study is being conducted on a finite
group, such as employees in a company or students in a university,
the population is finite and thus the chances of error are minimized. In
the absence of a sampling frame, one should disperse the questionnaire on all
relevant platforms, mailing lists, chat rooms, newsgroups, etc. However, there is
still no way of knowing whether the sample that responded is representative
of the population one wanted to study.
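When a complete list of the finite group is available (for instance, all employees of one company), that list can serve as the sampling frame and a probability sample can be drawn from it. The sketch below is only an illustration under that assumption; the e-mail addresses, the frame size and the function name are hypothetical and do not come from the text.

```python
import random

def draw_simple_random_sample(frame, n, seed=None):
    """Return n distinct units drawn with equal probability from a finite frame."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: every employee appears exactly once.
employee_frame = [f"employee{i:03d}@example.com" for i in range(1, 201)]

invited = draw_simple_random_sample(employee_frame, n=20, seed=42)
print(len(invited), "respondents invited out of", len(employee_frame))
```

Because every unit in the frame has the same chance of selection, the resulting sample is a probability sample. This is exactly what cannot be guaranteed when the questionnaire is simply dispersed on open platforms.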


Added to the challenge is the fact that the same user may have multiple
accounts, and it is difficult to establish on which accounts he/she is
active or inactive. To a certain extent, there are various
companies across the globe that have recognized the opportunity in this
gap and provide the service of sampling users directly from various websites.
NetZero is one such free Internet service provider. The company has a barter
strategy: in exchange for complete profiling and tracking rights over the user’s
site behaviour, it offers free Internet access. Despite the invasion
of privacy, the company has more than 8 million users. Thus, the firm has a
database of consumers and can, to a certain extent, assist in improving the
representative nature of the sample. Based on the profile of consumers, it can also
help manage an experimental design with experimental and control groups.


Another company utilizing this barter strategy is Knowledge Networks.
This company uses RDD (random digit dialling) methods to recruit individuals
for a household panel survey. Such a panel needs to be longitudinal in nature.
The recruited and screened panellist is provided a free Web TV receiver and
Internet access in exchange for agreeing to participate in the online panels/
surveys.


There are some typical ways of sampling on the net. 


Open-Internet samples: This sample includes people who, for whatever
reason, volunteered to complete the online questionnaire survey. Some also
opt for being part of online panels. This method suffers from the problem
of self-selection. The second problem is that if the survey is too long, respondents
might get bored or lose interest and quit without completing the survey. Also,
these surveys are sometimes mailed and sometimes rolled out as pop-up
surveys. The challenge with executing pop-up surveys is that most
Internet users these days have a pop-up blocker. Sometimes, the researcher
also conducts an Internet-intercept survey, which involves interjecting into an
Internet user’s activity on a typical homepage of any site.


Screened-Internet samples: This screened sample could be from the open
sample group, or its members might be part of a particular database or service
provider like NetZero. They are first administered a screening questionnaire and then
requested, based on the study requirement, to complete the survey. Sometimes,
using the screener, it is also possible to classify them into separate segments.
In this case, it is possible to direct them towards separate questions based on
their characteristics. For example, in a study on compensation and rewards,
there might be groups of public sector as well as private sector workers, so they
are directed towards different sections.
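The screening-and-routing logic described above can be sketched in a few lines. This is a minimal illustration only; the field names (`currently_employed`, `sector`), the section labels and the screening rule are assumptions made for the example and are not taken from any survey package.

```python
def route_respondent(screener):
    """Return the questionnaire section for a screened respondent, or None if screened out."""
    # Screening rule (assumed for this example): only employed respondents qualify.
    if not screener.get("currently_employed", False):
        return None
    sector = screener.get("sector")
    if sector == "public":
        return "SECTION_PUBLIC"
    if sector == "private":
        return "SECTION_PRIVATE"
    return None  # unknown sector: does not match a study segment

# Hypothetical screener answers for three respondents.
respondents = [
    {"id": 1, "currently_employed": True, "sector": "public"},
    {"id": 2, "currently_employed": True, "sector": "private"},
    {"id": 3, "currently_employed": False, "sector": "private"},
]
routing = {r["id"]: route_respondent(r) for r in respondents}
print(routing)
```

Respondents who fail the screener receive no section at all, while the others are directed to the block of questions written for their segment.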


Recruited sample: These are members who are generally accessed as in
the traditional method. That is, once they are found to be representative of the
population under study, they are contacted through mail, email, telephone or in person.
After they agree to answer the survey, they are sent the questionnaire or
the link to the questionnaire, with a password to complete it.


Data Collection Methods for Online Research 


As is the case with traditional research process, online research also has the 
same basic two broad categories of data collection—primary and secondary. 


I. Secondary Methods


Let us discuss the secondary data collection methods in detail.


Search engines


Today, one of the most powerful and most frequently used sources of
secondary data is the Internet. A number of companies like Google, Wikipedia,
MSN search, and Yahoo search have recognized the merit of having a
full-fledged division dedicated to this. The search engines have their own
programmed web crawlers or web spiders (these are like web robots and they
systematically “crawl” the Internet to search and index sites/information) for
taking the “searcher” to various sites. Some popular ranking methods are based on
keywords and their density, after which they look at the link popularity, in
terms of how many times a site has been accessed, and today, with the monetization
of sites, how much one needs to pay per click. There are general
search engines like Google and Yahoo, and more specific sources: for example,
when you are looking for statistical data related
to Indian demographics, you go to www.censusindia.gov.in. Because of
the huge number of websites available, a single key term may return
1,000 or 10,000 options, and it is nearly impossible to tackle all of them. The
other challenge is that a lot of sites, especially scholarly search sites like
www.hbsp.harvard.edu (Harvard Business School Publishing), require a
password and cannot be accessed normally. Thus, researchers may like to
move to focused and reliable sites like pathfinders. Pathfinders are basically
sites that take the user to a limited portfolio of sites that are provided by
credible sources. www.pathfinderhealth.in is a pathfinder that is focused
on informational sites related to health and relevant to the Indian user/
practitioner. These sites have what are known as intelligent crawlers that
index specific topic-related results.
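To make the idea of keyword density concrete, the sketch below computes the share of words on a page that match a given keyword. It is a simplified illustration under stated assumptions: the sample text and the function name are hypothetical, and real search engines combine many more signals than this single ratio.

```python
import re

def keyword_density(text, keyword):
    """Share of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Hypothetical page text.
page = "Census data for India. The census covers population and literacy."
print(keyword_density(page, "census"))
```

Here the keyword appears twice among ten words, so its density is one-fifth. A page on which the search term is denser would, other things being equal, be treated as more relevant to that term.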


Newsgroups 


These are quite similar to other social media platforms. They are called
newsgroups because they are a primary method of communication in a
virtual world with like-minded professionals (e.g. marketing academicians:
www.marketingpower.com) or special interest groups (e.g. management
aspirants: www.pagalguy.com). The “Internet reader” can view threads
(conversation histories), pose questions to other group members, or rebuke
or disagree with points of view, more or less as in a face-to-face argument. A
typical newsgroup message looks very similar to an email. There is a sender,
a subject title and the actual message. These threads are powerful sources
of information, as a researcher can browse through an entire thread
and get a first-hand qualitative insight into what the respondent population
is thinking and doing.


Blogs 


Blogs originated in the late 1990s when they were usually managed by
an enthusiast who gave a chronological index of sites of interest and also
provided a personal commentary on the links or sites. However, later people
created their own private blogs, which were like public sharing of private,
personal views and thoughts. The fact that they are in the public domain
means they are accessible, and sometimes one’s expression of discontent or
despair that reflects a personal misery creates a reaction and can
lead to an uprising, as can be seen in a number of cases of rebellion in the
years 2011-12. Marketing researchers find blogs very interesting as they
are able to understand the lifestyle and beliefs of any consumer segment
rather than merely the product or the brand, thus making targeting and
positioning strategies more focused and meaningful. In fact, there are search
engines like www.blogsearch.com that can help a researcher conduct a blog
search on any topic of interest.


II. Primary Methods 


The premise of using the primary methods and the basic nuances of the
techniques remain the same. In this section, we will highlight the aspects that
are different and thus need to be taken care of while making use of any of
these. There are also some primary methods, such as netnography, that are unique
to this medium and will be dealt with at the end in some detail.


Before we proceed further, let us examine some categorizations of
online primary methods. One is between a web-based method, in which the
researcher makes use of a web-designed questionnaire to collect the
data from the respondent, and a communication method, which is
more personalized and targeted towards collecting specific information from
an identified sample group. The latter involves using email as a personalized
platform for collecting information.


The other categorization is synchronized versus non-synchronized. In the first, the
researcher/interviewer asks questions and the respondent answers in real time,
while in the second the questionnaire is sent to the respondent and he/she
answers at his/her convenience at a later time.


Online focus groups 


The online focus group is as rich in its conduct and usefulness as it is in the
real world. Here the focus group could be in the form of a chat or a discussion forum,
where the group members are already familiar with each other or else
are selected through the Internet. The method could be synchronized,
where all members and the moderator are discussing at a single moment in
time. However, there could also be non-synchronized focus groups, where
the members might post their comments and then move out of the group to
conduct other activities; someone else may respond much later, and the
user responds to that comment when he/she returns. These are typically
called bulletin boards.


As the method usually involves typing one’s response, and there could be
simultaneous responses from the group members, it is recommended that
one limit the group to 6-8 members rather than the 8-10 that is the practice
in a regular focus group. Secondly, the moderator must be fast in typing on
the keyboard and be very familiar with handling diversions and interjections
on the software platform. A typical online focus group lasts for about two
hours. While some group members might be keying in their response, others
might react with emoticons like smileys, etc., to express their feelings.
It might therefore be prudent to use two moderators so that multiple
reactions can be handled at the same time. Just like the traditional method,
the online methods have their own challenges and advantages. The advantages
are primarily in terms of cost and geographic reach and, to a certain extent,
the fact that they do not involve facing a group. The disadvantage is that the
richness of non-verbal cues is lost here.


Social network analysis 


This method has its origin in sociometry. Here, essentially one tries to study
social or virtual social ties. This involves studying the structure (hierarchy
and patterns) of networks that emerge between social or virtual actors. There
are essentially two aspects one is analyzing: nodes (the net users) and
ties (their relationships with each other). The ties could be a sharing of ideas or
information, a business transaction or an emotional transaction. One can
do two things in a social network. The first is to observe the way the information is
flowing, in terms of who is the centre of the network (opinion leader), who
is the loner, and whether there are two people who communicate more with each other
(dyads). The second is to ask questions and find out with whom
the group members would interact for emotional problems or information/
knowledge seeking. Basically, the idea is to assess how decisions are
taken in group settings and how group dynamics influence individual or
group behaviour in a particular network.
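The observational approach can be illustrated with a very small network. In the sketch below, nodes are members and ties are observed message exchanges. The member names, the message counts and the choice of degree (number of ties) as the measure of centrality are all assumptions made for the example; dedicated network analysis software offers many other measures.

```python
from collections import Counter

# Hypothetical ties: (member, member, number of messages exchanged).
ties = [
    ("asha", "ben", 5),
    ("asha", "chen", 2),
    ("asha", "dev", 1),
    ("ben", "chen", 9),
]
members = ["asha", "ben", "chen", "dev", "elan"]

# Degree: the number of distinct members each node is tied to.
degree = Counter({m: 0 for m in members})
for a, b, _ in ties:
    degree[a] += 1
    degree[b] += 1

centre = max(members, key=lambda m: degree[m])      # candidate opinion leader
loners = [m for m in members if degree[m] == 0]     # members with no ties at all
strongest_dyad = max(ties, key=lambda t: t[2])[:2]  # pair that communicates most

print(centre, loners, strongest_dyad)
```

In this toy network the member with the most ties is the likely centre, the member with no ties is the loner, and the pair with the heaviest exchange forms the strongest dyad.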


Online surveys 


The online survey may be conducted in both real-time and non-synchronized modes.
The survey could involve either of the following two methods: 


• E-mail-based surveys: These are generally conducted after the
sampling has been done and the email address of the respondent
has been made available, after which the study instrument may be
attached to the mail or embedded in it. In the embedded case, there
is a short introduction to the study; the respondent answers
the questions and then carries out the simple action of replying, and the filled
questionnaire returns to the researcher. In the other case,
there is an attachment which needs to be downloaded and then filled
in. This can be either sent back as an attachment, or a physical copy
can be mailed back to the researcher.

• Web-based surveys: These involve using software or a program to
generate a questionnaire. This method has a huge advantage in terms
of design capabilities. One can make the questionnaire engaging and
interesting by making use of computer programs. Secondly, the
filter and branching questions that are tedious when done in the
traditional manner are handled very efficiently here. In most instances,
the instrument requires the respondent to click/key in the button
indicating their response. There are multiple web survey packages
available today that can help the researcher to efficiently design a web
survey, e.g. WebSurveyor, Perseus, SurveyMonkey, Zoomerang, etc.
The software further segregates and categorizes data by tabulating the
responses. Thus, the task of data entry and coding
is saved, and the human error in data entry is eliminated. The basic
challenge lies not in designing but in getting the respondent to the
instrument and motivating them to complete the survey.
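The automatic tabulation that web survey software performs can be sketched as follows. This is an illustrative example only and does not reproduce the output of any of the packages named above; the response categories and the sample answers are hypothetical.

```python
from collections import Counter

# Response categories are fixed when the questionnaire is designed.
CATEGORIES = ["Strongly agree", "Agree", "Neutral", "Disagree", "Strongly disagree"]

def tabulate(responses):
    """Return (category, count, percentage) rows in the predefined category order."""
    counts = Counter(responses)
    total = len(responses)
    rows = []
    for category in CATEGORIES:
        share = round(100 * counts[category] / total, 1) if total else 0.0
        rows.append((category, counts[category], share))
    return rows

# Hypothetical answers to one closed-ended question.
answers = ["Agree", "Agree", "Neutral", "Strongly agree", "Disagree"]
table = tabulate(answers)
for category, count, share in table:
    print(f"{category:<18} {count:>3} {share:>6}%")
```

Because each answer arrives already coded into one of the predefined categories, no manual data entry step is needed before the counts and percentages are produced.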


Netnography 


Robert V. Kozinets (2010) came up with an online method that has its
roots in ethnographic analysis. Ethnography is basically an anthropological
technique used quite actively today in the field of marketing and consumer
research. The method is distinguished from other primary methods as
it uses multiple methods in conjunction with each other to arrive at a rich and
holistic picture of a culture or a community. The methods popularly used
are the observation method, semiotics, films, documentaries, conversational
and discourse analysis, and videography. The idea is to use every possible
piece of communication/information that has been created by the
users of that community to understand the apparent and latent aspects of
the community.


Kozinets took the participant-observation method to understand
discourse and conversations on the computer as the source of data. Thus, the
premise is that along with its other methods, ethnographic analysis must take
into account the data obtained from a netnographic analysis.


Ethnography to netnographic analysis can be viewed as a continuum.
At one end is face-to-face interaction (observation, dialogue, data
collection), which is an ethnographic analysis. Let us say we want to study
the world or challenges faced by single mothers of autistic children. Now, let
us say that these single mothers spend considerable time online. Thus, at the
next stage we study these communities online, and both the face-to-face and
online methods provide us a rich understanding of the group in its entirety.
The last stage is when we study only online communities, such as Second Life, and
our observations are limited to their online interaction. This method is
called netnography. The method has its own set of peculiarities that need to
be understood before we discuss the method of netnographic analysis. The
first is alteration: the technology-based medium in which the interaction
is happening is different from the traditional interaction, as people move
in and out of the platform and come back sometimes instantly and sometimes
after days to respond to a message or communication. The second is the
anonymous nature of the medium, which lets the community member give vent
to behaviour, feelings and expression that may never be possible in the actual
world. However, this can also be a challenge, as it becomes extremely difficult
to identify the community or even gender this person belongs to. The third
aspect is accessibility: once part of an online community, one is privy to
everything and anything that the person is doing in their virtual world. The
last is that, because of the very nature of storage, historical archiving of
activity and communication is extremely easy.


A typical netnographic analysis involves adopting a structured 
approach. 


• Step 1- Identifying the research question and objectives: Once this is done
and you have identified what kind of information or knowledge
you seek about the community, you first need to visit sites frequented
by the communities (secondary data) to understand their typical lingo,
their concerns and patterns of communicating with each other.

• Step 2- Identifying and approaching the communities: Once you
have understood them to a certain degree, the next thing is to identify
the forums or groups on which they interact. These could be chat
forums, bulletin boards and social networking sites. Next, one needs to
shortlist the communities that one wants to enter. It is suggested that
one enter groups that are interactive, active and heterogeneous, and
whose communication content is rich.

• Step 3- Ethical immersion and participation in the communities:
At every stage in the study, the researcher must follow an ethical path
to the introduction and participation in a community. Thus, at the time
of entering the community, the researcher explains the academic
purpose behind the desire to enter it. The data collection here
is also multi-fold. It involves posting comments, posing questions,
getting feedback, taking online initiative and taking leadership roles.
The researcher has to decide how the communication and online
behaviour is to be recorded. It is advised, however, that the researcher
maintain observational field notes on these communication pieces.


• Step 4- Data analysis and interpretation: As with any other qualitative
method, the researcher needs to make sense of the huge amount of
conversation pieces that he/she has gathered and tries to discern the
underlying or common patterns of ideas or behaviour. This can be done
manually, where the researcher attempts to draw categories and tries to
establish possible relationships or links between observed attitudes or
behaviour. Please understand this is not interpretation but analysis that
is very similar to content analysis. There are also CAQDAS
(computer-assisted qualitative data analysis software) programs that
do the same analysis by identifying and coding
recurrent themes.

• Step 5- Evaluating and interpreting netnographic data: Kozinets
has identified 10 criteria that a netnographic analysis must meet in
order for its findings to be considered an accurate ground for
establishing any characterization of the community or
culture under study. The premise essentially is that the developed
ideas and constructs must be distinct from each other. They should
be grounded in some theoretical framework, allow for flexibility of
interpretation by other researchers and be able to inspire some kind of
applied social action with reference to the community.
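The coding of recurrent themes mentioned in Step 4 can be illustrated with a deliberately simple keyword-based sketch. It is not how any particular CAQDAS package works; the codebook, the themes and the posts are hypothetical, and the matching is plain substring matching, which a real study would refine.

```python
from collections import Counter

# Hypothetical codebook: theme -> keywords that signal it.
CODEBOOK = {
    "cost": ["fees", "expensive", "afford"],
    "support": ["help", "advice", "support"],
}

def code_post(post):
    """Return the set of themes whose keywords appear in the post."""
    text = post.lower()
    return {theme for theme, words in CODEBOOK.items() if any(w in text for w in words)}

# Hypothetical community posts gathered during immersion.
posts = [
    "Therapy fees are so expensive this year.",
    "Any advice on schools? We need help.",
    "Thanks for the support, but I cannot afford it.",
]
theme_counts = Counter(theme for p in posts for theme in code_post(p))
print(theme_counts)
```

Counting how often each theme recurs is analysis in the sense used above; deciding what the recurring themes mean for the community remains the researcher's interpretive task.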


Today, netnography is a technique that is being applied to blogs, tweets,
social networking sites like Facebook, podcasts and videocasts. The
technique becomes increasingly important as it is able to provide insights
into how people think and react. Companies are able to connect with their
customers/stakeholders better if they understand the person’s inner world. A
further use is that the research can provide valuable means of communicating
with these communities in a manner and language that they understand and
believe in.


Online data metrics 


The research process involved in an online research study is very similar to 
that conducted otherwise. However, there are certain variable measurements
that are unique to online research. It is not possible to discuss each one of 
them at length; however, an attempt is made to give the reader a substantive 
idea about what to look for and how to measure it. 


1. Cookie: A cookie is a record stored on your computer of your visits
to a website. Every cookie has an ID number, a domain name and an expiry
date, which makes it useful for tracking user behaviour.


2. Web server log files: Most web hosts have an inbuilt mechanism for
storing every request made to a website. Thus, details about the users
who accessed your site are available to you. One can program the web
analytics software to record the visitor information in the manner
one wishes.


3. Page tagging: Besides the website as a whole, one can tag individual
pages and record details of those who visited each page. This is related
to what we referred to as intelligent crawling, where the user might be
looking for specific information. There are free analytics services,
like Google Analytics, that can assist in this form of tracking.
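
To make the idea of web server log files concrete, the sketch below reads a few log entries and counts page requests and distinct visitors. It assumes the entries follow the widely used Common Log Format; the sample entries, the function names and the regular expression are illustrative assumptions rather than part of any particular analytics package.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [timestamp] "method path protocol" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count requests per page and collect the distinct visitor hosts."""
    page_views = Counter()
    visitors = set()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip malformed lines
        page_views[record["path"]] += 1
        visitors.add(record["host"])
    return page_views, visitors


# Hypothetical entries, invented for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.11 - - [10/Oct/2023:13:56:01 +0000] "GET /offers.html HTTP/1.1" 200 1045',
    '192.0.2.10 - - [10/Oct/2023:13:57:12 +0000] "GET /offers.html HTTP/1.1" 200 1045',
    'not a log line',
]

page_views, visitors = summarize(SAMPLE_LOG)
for path, count in page_views.most_common():
    print(path, count)
print("distinct visitors:", len(visitors))
```

A host address is only a rough stand-in for a visitor; in practice, cookies or page tags are usually combined with the log to identify returning users.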


Key performance indicator 


Key performance indicators (KPIs) are essentially measures of outcome, that
is, of the dependent variable, and the researcher can decide what he/she wants
to assess depending on the objective of the research study. Some popular KPIs are:


1. Ad impression: This is a measure of the number of times an ad banner
is displayed on the Internet.


2. Cost per thousand impressions (CPM): This is the price paid for every
thousand impressions of an ad. This model, based on impressions (essentially
awareness), was the model used till 1997. After that, the web marketer was
more concerned about the viewer, and the company paid for being seen by
the user.


3. CTR: Click-through rate is a percentage figure: the ratio of the
number of clicks an ad gets to the number of times the ad was shown
(impressions).


4. Bounce rate: Bounce rate indicates the proportion of people who visit a
website’s landing page and bounce back without browsing further.


5. Open rate: When some information or a link is sent by e-mail, the open
rate is the proportion of recipients who opened the e-mail. Tracking it
requires the HTML or image content to load; if the recipient has disabled
this, the open rate cannot be used as a metric.


6. CTOR (click to open rate): When a link is sent in an e-mail, the CTOR
measures the number of people who clicked the link relative to those who
opened the e-mail.


7. Conversion rate: This is the proportion of people who visit your site
and carry out a specific action, say, a purchase.


8. Abandonment rate: The proportion of those who start an action but quit
before completing the required activity, say, making a payment at the
payment gateway.


9. Page views: The number of pages on your site viewed by a site visitor.


10. Absolute unique visitor: The details of the visitor who visited your
website during a specific time period, say, an online promotion.


11. New vs returning visitors: Those who arrive at the page for the first
time vs those who have visited the site earlier.


12. Cost per click (CPC): The ratio of the advertising spend to the number
of clicks the sponsored search or banner advertisement got. This became
more important than CPM, as a click means a higher probability
that the user will convert into a purchase at the site.


13. Transaction conversion rate (TCR): This is the ratio of the number of
transactions (conversions) to the number of clicks on the advertisement.


14. Take rate = CTR × TCR: This is the proportion of ad impressions on
which a visitor clicks and then converts into a transaction.


15. Return on ad dollars (ROA): This is a measure of the total revenue made
relative to the cost of Internet marketing.


16. Word of mouth (WOM): This is an important metric for evaluating
social media effectiveness:


WOM = (Number of direct clicks + Number of clicks based on recommendation)
      / Number of direct clicks


These are examples of outputs that reflect the objective of an online
strategy. The business researcher might either study the pattern of these
metrics across segments or communities, or try to establish their
antecedents, as these insights are what the business manager needs in
order to better manage his/her e-commerce activities.
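
The ratio-based KPIs listed above can be sketched as a small calculation from raw campaign counts. This is a minimal illustration under stated assumptions: CTR is taken as clicks divided by impressions and TCR as transactions divided by clicks, so that take rate = CTR × TCR gives transactions per impression. The function names and the campaign figures are hypothetical.

```python
def ratio(numerator, denominator):
    """Divide safely: return 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def campaign_kpis(impressions, clicks, transactions, ad_spend, revenue):
    """Compute a few of the KPIs above from raw campaign counts."""
    ctr = ratio(clicks, impressions)    # click-through rate
    tcr = ratio(transactions, clicks)   # transaction conversion rate (assumed definition)
    return {
        "ctr": ctr,
        "cpm": 1000 * ratio(ad_spend, impressions),  # cost per thousand impressions
        "cpc": ratio(ad_spend, clicks),              # cost per click
        "tcr": tcr,
        "take_rate": ctr * tcr,                      # transactions per impression
        "roa": ratio(revenue, ad_spend),             # return on ad dollars
    }


def word_of_mouth(direct_clicks, recommended_clicks):
    """WOM = (direct clicks + recommendation-based clicks) / direct clicks."""
    return ratio(direct_clicks + recommended_clicks, direct_clicks)


# Hypothetical campaign figures, invented for illustration only.
kpis = campaign_kpis(
    impressions=10_000, clicks=200, transactions=10,
    ad_spend=500.0, revenue=2_000.0,
)
wom = word_of_mouth(direct_clicks=200, recommended_clicks=50)

for name, value in kpis.items():
    print(f"{name}: {value:.4f}")
print(f"wom: {wom:.2f}")
```

Keeping every KPI as a ratio of two raw counts makes it easy to compare the same metric across segments or communities, which is the kind of pattern analysis described above.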


2.3 DISTINCTION BETWEEN DIFFERENT TYPES 
OF RESEARCHES 


We have already learned the meaning of different types of research in Unit 1.
Let us recapitulate the distinctions between the major types of research:


• Pure vs Applied Research: Pure research is also known as fundamental
research. It is exploratory in nature and is conducted without any
practical end-use in mind. Pure research is driven by interest, curiosity
or intuition, and its objective is to advance knowledge and to identify or
explain relationships between variables. Applied research, in general, is
not carried out for its own sake but in order to solve specific, practical
questions or problems. It tends to be descriptive rather than exploratory
and is often based upon pure research.


• Historical vs Futuristic Research: Historical research entails
understanding past events to predict future ones. On the other hand, 
futures research can be defined as a systematic study of possible future 
events and circumstances. 


• Synthetic vs Analytical Research: A synthetic approach to research
looks at the research question or topic from a holistic point of view. 
Here, the researcher tries to comprehend the parts of the problem by 
looking at the whole. An analytic approach to research would look 
at a topic from a constituent point of view. The researcher tries to 
comprehend the whole phenomenon by looking at the separate parts. 


• Descriptive vs Prescriptive Research: Descriptive research is
employed to describe characteristics of a population or phenomenon
being studied. It does not answer questions about how/when/why the
characteristics occurred. Prescriptive research, like evaluative research,
is applied rather than theoretical. It differs from evaluative research
in that it goes a step further, beyond identifying success, performance
or outcomes, and actually recommends solutions or new ideas.


• Experimental vs Survey Research: Experiment and survey methods
are both highly critical in data gathering. Both types of research are
employed to test hypotheses and come up with conclusions. Research through
experiments entails the manipulation of an independent variable
and measuring its effect on a dependent variable. On the other hand,
conducting surveys often entails the use of questionnaires and/or
interviews. While the experimental method is a type of experimental
research, the survey is a type of descriptive research.


• Case vs Generic Research: Case studies are a type of descriptive
research which encompasses detailed analysis of a single person or event
(or a limited number of them). Case studies are typically interesting
because of the unusualness of the case. On the other hand, generic
research, or generic qualitative research, is presumed to go beyond the
observable constructs to variables that are not visible or measurable
and that have to be deduced by different techniques.


Check Your Progress 


1. What is the disadvantage of the skewed sample method?
2. List one challenge of online studies.
3. What are KPIs?
4. What is pure research also known as?


2.4 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. The disadvantage of the skewed sample method is that data
collection, especially of primary data, can only be conducted on people
who are Internet-savvy.


2. One of the major challenges in online studies is designing an effective
sampling plan and obtaining a representative sample.


3. Key performance indicators (KPIs) are essentially measures of outcome
or the dependent variable and the researcher can decide what he/she
wants to assess depending on the objective of the research study.


4. Pure research is also known as fundamental research. 


2.5 SUMMARY 


• In the last decade, the Internet came to be recognized as
a useful source of secondary information, such as databases and online
resources.


• A typical phenomenon of virtual space is that companies now have to
face the true aspects of designing consumer-centric strategies.


• One of the major challenges in online studies is designing an effective
sampling plan and obtaining a representative sample.


• Since no concrete sampling frame of Internet users exists, obtaining a
probability sample is a difficult task.


• As is the case with the traditional research process, online research
has the same two broad categories of data collection: primary
and secondary.


• The premise of using the primary methods and the basic nuances of
the techniques remain the same.


• The focus group is as rich in its conduct and usefulness as it is in the
real world. Here the focus group could take the form of a chat or a
discussion forum, where the group members are either already familiar
with each other or are selected through the Internet.


• The research process involved in an online research study is very
similar to that conducted otherwise. However, there are certain variable
measurements that are unique to online research.


• Transaction conversion rate (TCR): This is the ratio of the number of
transactions (conversions) to the number of clicks on the advertisement.


• Historical research entails understanding past events to predict future
ones. On the other hand, futures research can be defined as a systematic
study of possible future events and circumstances.


• Case studies are a type of descriptive research which encompasses
detailed analysis of a single person or event (or a limited number of them).


2.6 KEY WORDS


• Questionnaire: It refers to a set of printed or written questions with
a choice of answers, devised for the purposes of a survey or statistical
study.


• Case Studies: It refers to the process or records of research into the
development of a particular person, group, or situation over a period
of time.


• Ethnography: It refers to the scientific description of peoples and
cultures with their customs, habits, and mutual differences.


• Focus Group: It refers to a group of people assembled to participate
in a discussion about a product before it is launched, or to provide
feedback on a political campaign, television series, etc.


2.7 SELF ASSESSMENT QUESTIONS AND
EXERCISES


Short-Answer Questions


1. What are blogs?
2. Write a short note on online surveys.
3. What is prescriptive research?


Long-Answer Questions


1. Differentiate between various types of research.
2. What is online research? Discuss its advantages and disadvantages.
3. Examine the data collection methods for online research.


UNIT 3 SIGNIFICANCE OF
RESEARCH


3.0 INTRODUCTION


In the previous unit, you were introduced to recent advancements in research.
In this unit, we discuss hypotheses. You will learn how to construct a
hypothesis. A hypothesis is constructed after the preliminary research is
conducted. It is worded in such a way that it can be tested in the
experiment(s), and it must encompass both independent and dependent
variables. The unit will also discuss the significance of research in the
social sciences and the concepts of induction and deduction. The process
of research is also explained here.


3.1 OBJECTIVES 


After going through this unit, you will be able to: 
• Discuss the different types of variables
• Understand the significance of research in social science
• Examine inductive and deductive reasoning
• Describe the process of research


3.2 CONSTRUCTING HYPOTHESIS 


Let us begin the discussion on constructing a hypothesis by examining how
variables are identified.


3.2.1 Identifying Variables


To carry out an investigation, it becomes imperative to convert the concepts
and constructs to be studied into empirically testable and observable variables.
A variable is generally a symbol to which we assign numerals or values.
A variable may be dichotomous in nature, that is, it can possess only two
values, such as male/female or customer/non-customer. Variables whose values
can only fit into a prescribed number of categories are discrete variables;
for example, occupations can be: Teacher (1), Civil Servant (2), Private
Sector Professional (3) and Self-employed (4). Still others, such as age,
income and production data, can take an indefinite set of values.


Variables can be further classified into five categories, depending on 
the role they play in the problem under consideration. 


• Dependent variable: The most important variable to be studied and
analysed in a research study is the dependent variable (DV). The entire
research process is involved in either describing this variable or
investigating the probable causes of the observed effect. Thus, this in
essence has to be reduced to a measurable and quantifiable variable. For
example, in the organic food study, the consumers’ purchase intentions
and the retailers’ stocking intentions, as well as sales of organic food
products in the domestic market, could all serve as the dependent
variable.


A financial researcher might be interested in investigating the Indian
consumers’ investment behaviour after the recent financial slowdown.
In another study, the HR head at Cognizant Technologies would like to
study the organizational commitment and turnover intentions of short
and long tenure employees in the company.


Hence, as can be seen from the above examples, it might be possible 
that in the same study there might be more than one dependent variable. 


• Independent variable: Any variable that can be stated as influencing
or impacting the dependent variable is referred to as an independent 
variable (IV). More often than not, the task of the research study is 
to establish the causality of the relationship between the independent 
and the dependent variable(s). The proposed relations are then tested 
through various research designs. 


In the organic food study, the consumers’ attitude towards healthy 
lifestyle could impact their organic purchase intention. Thus, attitude becomes 
the independent and intention the dependent variable. Another researcher 
might want to assess the impact of job autonomy and role stress on the 
organizational commitment of the employees; here job autonomy and role 
stress are independent variables. 


• Moderating variables: Moderating variables are the ones that have a
strong contingent effect on the relationship between the independent 
and dependent variables. These variables have to be considered in the 
expected pattern of relationship as they modify the direction as well 
as the magnitude of the independent-dependent association. In the 
organic food study, the strength of the relation between attitude and 
intention might be modified by the education and the income level of 
the buyer. Here, education and income are the moderating variables 
(MVs). 


In a consulting firm, the management is looking at the option of
introducing a flexi-time work schedule. Thus, a study might need to be
undertaken to see whether there will be an increase in the productivity of
each individual worker (DV) subsequent to the introduction of a flexi-time
(IV) work schedule.


In real-time situations and actual work settings, this proposition might
need to be revised to take into account other impacting variables. A second
variable might need to be introduced because it has a significant
effect on the stated relationship. Thus, we might like to modify the
above statement as follows:


There will be an increase in productivity of each individual worker 
(DV) subsequent to the introduction of a flexi-time (IV) work schedule, 
especially amongst women employees (MV). 


There might be instances when confusion might arise between a 
moderating variable and an independent variable. 


Consider the following situation: 


• Proposition 1: Turnover intention (DV) is an inverse function of
organizational commitment (IV), especially for workers who have
a higher job satisfaction level (MV).


Another study might have the following proposition to test.


• Proposition 2: Turnover intention (DV) is an inverse function of
job satisfaction (IV), especially for workers who have a higher
organizational commitment (MV).


Thus, the two propositions are studying the relation between the same
three variables. However, the decision to classify one as independent and the
other as moderating depends on the research interest of the decision maker.


To understand the impact and role of the moderator variable, let us
represent the relationships graphically (Figure 3.1). Here, a represents the
effect of the independent variable (job satisfaction); b represents the effect
of the second variable, the moderator (organizational commitment);
and c represents the moderating effect, which is the combined effect of
the moderating variable and the independent variable on the dependent
variable. Thus, the effect of c has to be large enough and statistically
significant to prove the moderation hypothesis.


• Intervening variables: An intervening variable (IVV) has a temporal
connotation to it. It generally follows the occurrence of the independent 
variable and precedes the dependent variable. Tuckman (1972) defines 
it as ‘that factor which theoretically affects the observed phenomena but 
cannot be seen, measured, or manipulated; its effects must be inferred 
from the effects of the independent variable and moderator variables 
on the observed phenomenon.’ 


[Figure 3.1: Job Satisfaction (Independent Variable, I.V.; path a),
Organizational Commitment (Moderator Variable, M.V.; path b) and the
product term Job Satisfaction × Organizational Commitment (path c) each
point to Turnover Intention (Dependent Variable, D.V.).]


Fig 3.1 Graphical representation of moderating variable: Proposition 2


For example, in the previous case, there is an increase in job satisfaction
(IVV) of each individual worker, subsequent to the introduction of a flexi-time
(IV) work schedule, which eventually affects the individual’s productivity
(DV), especially amongst women employees (MV). Another example would
be: the introduction of an electronic advertisement for the new diet drink (IV)
will result in increased brand awareness (IVV), which in turn will impact
the first quarter sales (DV). This would be significantly higher amongst the
younger female population (MV).


[Figure 3.2: Flexi-time Work Schedule (Independent Variable, I.V.) points
directly to Productivity (Outcome, D.V.) along path a, and indirectly
through Job Satisfaction (Mediating Variable).]


Fig 3.2 Graphical Representation of Mediating Variable


In current research terminology, the intervening variable is also called a
mediating variable, as it mediates the strength and direction of the relationship
between the independent and dependent variable (Figure 3.2). For example, in
the above case, the direct effect of the predictor or the independent variable
is measured by a, and the mediating impact of the mediating variable is
represented by b. However, the point to be noted is that the independent
variable acts on the mediating variable, as represented by c. Thus, to prove a
mediating relationship, one would expect that the effect of b would be more
than the effect of a and that this could be proven to be statistically significant.
The best case of mediation would be if a was zero, that is, the predictor had no
direct effect on the outcome variable. The impact of the mediating variable
is assessed by the method of structural equation modelling.


• Extraneous variables: Besides the moderating and intervening
variables, there might still exist a number of extraneous variables
(EVs) which could affect the defined relationship but might have
been excluded from the study. These would most often account for the
chance variations observed in the research investigation. For example,
a tyrannical boss, family pressures or the nature of the industry could
influence the effect of flexi-time, but since these would apply only to
individual cases, they might not heavily impact the direction of the
findings. However, in case the effect is substantial, the researcher might
try to block their effect by using an experimental and a control group
(this concept will be discussed later in the section on experimental
designs).


At this stage, we can clearly distinguish between the different 
kinds of variables discussed above. An independent variable is the prime 
antecedent condition which is qualified as explaining the variance in the 
dependent variable; the intervening variable follows the occurrence of the 
independent variable and may in turn impact the dependent variable; the 
moderating variable is a contributing variable which might impact the defined 
relationship; the extraneous variables are outside the domain of the study and 
responsible for chance variations, but in some instances, their effect might 
need to be controlled. 
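
The moderation relationship of Figure 3.1 can be sketched as a regression with an interaction term: DV = intercept + a·IV + b·MV + c·(IV × MV), where c is the moderating effect. The code below is a minimal illustration, not a prescribed method. The scores are invented and noise-free, the function names are hypothetical, and the least-squares fit is solved with plain Python so that no external library is needed. Testing whether c is statistically significant, as the text requires, is not shown.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_ols(rows, outcomes):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcomes)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def fit_moderation(iv, mv, dv):
    """Fit dv = intercept + a*iv + b*mv + c*(iv*mv); return (intercept, a, b, c)."""
    rows = [[1.0, x, m, x * m] for x, m in zip(iv, mv)]
    return tuple(fit_ols(rows, dv))


# Hypothetical scores for Proposition 2, invented for illustration only.
job_satisfaction = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 2, 4]   # IV
commitment = [1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 2, 2]         # MV
# Turnover intention (DV) generated without noise from known effects.
turnover = [10 - 0.5 * x - 1.0 * m - 0.3 * x * m
            for x, m in zip(job_satisfaction, commitment)]

intercept, a, b, c = fit_moderation(job_satisfaction, commitment, turnover)
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
```

A negative c here means that the inverse relation between job satisfaction and turnover intention becomes stronger as organizational commitment rises, which is what Proposition 2 asserts.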


3.2.2 Characteristics and Functions 


Any assumption that the researcher makes on the probable direction of the
results that might be obtained on completion of the research process is termed
a hypothesis. Unlike the research problem, which generally takes a question
form, the hypothesis is always in a declarative form. The statements thus
formulated can lend themselves to empirical enquiry. Kerlinger (1986) defines
a hypothesis as ‘...a conjectural statement of the relationship between two or
more variables.’ According to Grinnell (1993), ‘A hypothesis is written in
such a way that it can be proven or disproven by valid and reliable data – it
is in order to obtain these data that we perform our study’.


While formulating a hypothesis, there are certain criteria that the
researcher must fulfil. These are:


• A hypothesis must be formulated in simple, clear, and declarative form.
A broad hypothesis might not be empirically testable. Thus, it might
be advisable to make the hypothesis unidimensional, testing
only one relationship between only two variables at a time.


o Consumer liking for the electronic advertisement for the new diet
drink will have a positive impact on brand awareness of the drink.


o High organizational commitment will lead to lower turnover
intention.


• A hypothesis must be measurable and quantifiable so that the statistical
authenticity of the relationship can be established.


• A hypothesis is a conjectural statement based on the existing literature
and theories about the topic and not based on the gut feel or subjective
judgement of the researcher.


• The validation of the hypothesis would necessarily involve testing the
statistical significance of the hypothesized relation. For example, the
above two hypotheses would need to use correlation and regression
analysis respectively to test the stated relationship.
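
As an illustration of the last criterion, the sketch below computes a Pearson correlation, its t statistic and a simple regression line for the first example hypothesis (liking for the advertisement and brand awareness). The scores are invented, the function names are hypothetical, and the t statistic would still have to be compared with a critical value from a t table with n - 2 degrees of freedom.

```python
from math import sqrt


def _deviation_sums(xs, ys):
    """Return (sxy, sxx, syy): sums of cross-products and squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    sxy, sxx, syy = _deviation_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no linear relationship (n - 2 degrees of freedom)."""
    return r * sqrt((n - 2) / (1 - r ** 2))


def simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    sxy, sxx, _ = _deviation_sums(xs, ys)
    slope = sxy / sxx
    intercept = sum(ys) / len(ys) - slope * sum(xs) / len(xs)
    return slope, intercept


# Hypothetical scores for six respondents, invented for illustration only.
ad_liking = [2, 4, 5, 7, 8, 9]               # liking for the advertisement
brand_awareness = [30, 38, 45, 52, 60, 66]   # awareness score for the drink

r = pearson_r(ad_liking, brand_awareness)
t = t_statistic(r, len(ad_liking))
slope, intercept = simple_regression(ad_liking, brand_awareness)
print(f"r = {r:.3f}, t = {t:.2f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The correlation answers whether the two variables move together, while the regression slope estimates how much awareness changes per unit of liking, which is why the two hypotheses call for different analyses.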


3.2.3 Types of Hypotheses 


The formulated hypothesis could be of two types: 


1. Descriptive hypothesis: This is simply a statement about the
magnitude, trend or behaviour of a population under study. Based
on past records, the researcher makes some presumptions about the
variable under study. For example:


• Students from the pure science background score 90-95 per cent
on a course on Quantitative Methods.


• The current advertisement for the diet drink will have a 20-25 per
cent recall rate.


• The attrition rate in the BPO sector is almost 33 per cent.


• The literacy rate in the city of Indore is 100 per cent.


[Figure 3.3 box labels: Management Decision Problem; Discussion with
Subject Experts; Review of Existing Literature; Organization Analysis;
Qualitative Analysis; Management Research Problem/Question.]


Fig 3.3 Problem Identification Process


2. Relational hypothesis: These are the typical kind of hypotheses which
state the expected relationship between two variables. If, while stating the
relation, the researcher makes use of words such as increase, decrease,
less than or more than, the hypothesis is said to be a directional or
one-tailed hypothesis.


For example,


• The higher the likeability of the advertisement, the higher is the recall
rate.


• The higher the work exhaustion experienced by the BPO professional,
the higher is the turnover intention of the person.


However, sometimes the researcher might not have reasonable
supportive data to hypothesize the expected direction of the relationship.
In this case, he or she would leave the hypothesis as non-directional
or two-tailed.


For example,


• There is a relation between quality of working life and job
satisfaction experienced by employees.


• Ban on smoking has an impact on the cigarette sales.


• Anxiety is related to performance.


The hypotheses discussed in this section are in prose form, that is, in a
verbal declarative sentence form.
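
The practical difference between a directional (one-tailed) and a non-directional (two-tailed) hypothesis shows up when a test statistic is converted into a p-value. The sketch below assumes a z statistic that follows the standard normal distribution; the value 1.8 is hypothetical and the function names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def p_value(z, alternative="two-sided"):
    """p-value of a z statistic for the chosen alternative hypothesis."""
    if alternative == "greater":      # directional: effect expected to be positive
        return 1.0 - normal_cdf(z)
    if alternative == "less":         # directional: effect expected to be negative
        return normal_cdf(z)
    if alternative == "two-sided":    # non-directional: any difference counts
        return 2.0 * (1.0 - normal_cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


z = 1.8  # hypothetical test statistic, for illustration only
one_tailed = p_value(z, "greater")
two_tailed = p_value(z, "two-sided")
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

For the same statistic, the two-tailed p-value is twice the one-tailed value, so a directional hypothesis is easier to support, but only if the direction was stated before the data were examined.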


Check Your Progress 


1. What is a variable? 


2. What is a descriptive hypothesis? 


3.3 SIGNIFICANCE OF RESEARCH IN SOCIAL 
SCIENCES 


Research plays an important role in many application areas. Some of them 
are as follows: 


• Finance, budgeting and investments: This includes the following
activities:


o Cash flow analysis, long-range capital requirements analysis, and
creation of investment policies and dividend policies


o Creation of credit policies, credit risks and account procedures such
as deposits and withdrawals


• Purchasing, procurement and exploration: This includes the
following activities:


o Determining the quantity and time of purchase of raw materials,
machinery and the like


o Defining the rules for buying and supplying products under varying
prices


o Determining the quantities and timings of purchases of finished
products


o Formulating strategies for exploration and exploitation of new
material sources


• Production management: This includes physical distribution of
products, facility planning and manufacturing planning.


o Physical distribution: It is further divided into the following
elements:


- Location and size of the warehouses, distribution centres, retail
outlets and so on


- Distribution policy


o Facility planning: It is further divided into the following elements:


- Production scheduling and sequencing of available resources


- Project scheduling and allocation of resources


- Determining the optimum production mix


o Manufacturing planning: It is further divided into maintenance
policies and preventive maintenance


• Research and development: It includes the following activities:


o Determining the areas of concentration of research and development


o Reliability and evaluation of alternative designs of research and
development


o Control of developed projects


o Coordination of multiple research projects


o Determining the time and cost requirements


3.4 SCIENTIFIC METHOD: INDUCTION AND 
DEDUCTION 


Good business research is based on sound reasoning, such as finding premises
that are correct, testing the connections between assumptions and facts,
and making claims that are based on adequate evidence. In the reasoning process,
induction and deduction, observation and hypothesis testing can be combined 
in a systematic way for producing scientific results. Scientific methods are 
practised in business research to guide our approach to problem solving. 
Some of the essential tenets of the scientific methods are: 


• Observation of phenomena
• Clearly-defined procedures, methods and variables
• Empirically testable hypotheses
• Ability to rule out rival hypotheses
• Statistical justification of conclusions
• The self-correcting process


The researcher using this approach of ‘empiricism’ attempts to describe, 
explain and make applications by relying on the information gained through 
observation. Clearly, reasoning is pivotal to much of the researcher’s 
success, which can be conveyed through one of the two types of discourse: 
exposition or argument. Exposition consists of statements that describe 
without attempting to explain. Argument allows us to defend, challenge, 
explain, interpret and explore meaning. The two types of argument that are 
of great importance to research are: deductive thought and inductive thought. 


The second concern in formulating business research problems is the
fact that, more often than not, managers become aware of problems, seek
information and arrive at decisions under conditions of bounded rationality.
This concept, formalized by March and Simon (1958), implies that managers
do not always work and take decisions in a perfectly rational sequence. The
model says that the information search or problem recognition phase, like any
other behaviour, has to be motivated. Unless the manager is driven by present
levels of dissatisfaction or by a high expected value of outcomes, the process
does not start. The next implication of the model is that, in most instances,
a manager does not have access to complete and perfect information.
Further, the manager might try to seek reasonably convenient and quick
information that meets minimal rather than optimal standards.


Deductive thought 


Deductive thought, also called deductive logic, is the process of reasoning 
from one or more general statements regarding what is known to reach a 
logically certain conclusion. It involves using given true premises to reach 
a conclusion that is also true. Under deductive logic, a specific conclusion 
is arrived at from a general principle. If the rules and logic of deduction are 
followed, this procedure ensures an accurate conclusion. Deductive arguments 
are evaluated in terms of their validity and soundness. Deductive logic or 
reasoning is usually considered to be a skill that develops without any formal 
teaching or training. 


Using deductive reasoning, researchers come up with a conclusion 
based on facts that have already been shown to be true. Hence, their conclusion 
is always true. The facts they use to prove their conclusion deductively may 
come from accepted definitions, postulates or axioms, or previously proved 
theorems. 


This kind of logic is a culmination, a conclusion or an inference drawn 
as a consequence of certain reasoned facts. The reasons cited have to be real 
and not a figment of the researcher’s judgement and second, the deductions 
or conclusions must essentially be an outcome of the same reasons. 


Unless all probable reasons have been isolated and identified, the nature 
of the inference is incomplete. 


Inductive thought 


On the other end of the continuum is inductive thought. Here there is no strong 
and absolute cause and effect between the reasons stated and the inference 
drawn. Inductive reasoning calls for generating a conclusion that is beyond 
the facts or information stated. 


Inductive thought, also known as inductive logic or inductive reasoning, 
constructs or evaluates propositions that are abstractions of observations of 
individual instances. In inductive reasoning, a general conclusion is arrived 
at by specific examples. Inductive logic is the process of coming up with a 
conclusion based on a series of events that repeat. An example would be that
pushing a light switch up turns on the light and pushing it down turns the light
off. If this is done again and again, say 100 times, it may be concluded that
the light goes on when the switch is up and it is off when the switch is down.
However, the conclusion may not always be true because other circumstances 
may cause the light to not go on when the switch is up, such as the light may 
burn out, the electricity may go off, etc. 
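
The light-switch example can be sketched in a few lines of Python. This is only an illustration: the function name induce_rule and the trial data are assumptions made for the sketch and do not come from the text. It shows that a rule induced from repeated observations holds only until a conflicting observation turns up.

```python
def induce_rule(observations):
    """Generalise from (switch, light) observations.

    Returns a mapping such as {"up": "on"} if every observation agrees,
    or None as soon as one observation conflicts with the others.
    """
    rule = {}
    for switch, light in observations:
        # setdefault stores the first outcome seen for this switch position
        if rule.setdefault(switch, light) != light:
            return None  # a conflicting case refutes the generalisation
    return rule


# 100 consistent trials: 50 with the switch up, 50 with it down.
trials = [("up", "on"), ("down", "off")] * 50
rule = induce_rule(trials)

# One further trial in which the bulb has burnt out.
rule_after_failure = induce_rule(trials + [("up", "off")])

print(rule)                # the induced generalisation
print(rule_after_failure)  # None: the generalisation no longer holds
```

However many consistent trials are collected, the induced rule remains probable rather than certain, which is the point the example makes.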


Thus, the fact of the matter is that inductive thought draws assumptions
and hypotheses which could explain the phenomena observed, and yet there
could be other propositions which might explain the event just as well as the
one generated by the manager/researcher. Each one of them has a potential
truth in it. However, we have more confidence in some than in others, so
we select them and seek further information in order to get confirmation.


In practice, scientific thought actually makes use of both inductive
and deductive reasoning in a chronological order. We might question the
phenomena by an inductive hypothesis and then collect more facts and reasons
to deduce that the hypothesized conclusion is correct.


Features of a Good Research Study 


In the above section we learnt that one method of arriving at solutions to
our professional dilemmas is through research. This method of enquiry, we
will subsequently learn, can vary from the loosely structured method based
on observations and impressions to the strictly scientific and quantifiable
methods. However, whatever the method of enquiry, it must adhere to
certain historically established criteria to be termed business research.
For a piece of research to be of value and to authenticate or contribute to the
body of knowledge, we feel that it must possess the following characteristics:


(a) It must have a clearly stated purpose, whether implicit, as when the
purpose is to develop a new system of inventory management, or explicit,
as when it is to establish quality standards for the service delivery model
in our mobile eye care unit. This refers not only to the objective of the
study, but also to a precise definition of the scope and domain of the
study. The variables and constructs that are being investigated (service
delivery model, quality standards, inventory management) need to be
defined in clear and precise terms.


(b) It must follow a systematic and detailed plan for investigating
the research problem. The source from which information is to be
collected about quality standards or inventory models has to be listed.
In case the data is to be collected from a sample of suppliers, retailers
and pathologists for investigating the gaps in the current inventory
model, the plan has to specify how the representativeness of the sample
to the total population is to be ensured, along with the estimated error.
Systematic conduct also requires that all the steps in
the research process are interlinked and sequential in nature.


(c) The selection of techniques for collecting information, sampling
plans and data analysis techniques must be supported by a logical
justification. If you are selecting only a secondary data source,
going for an online survey, or going to ENT specialists rather than
pathologists for your hearing aid study, the reason for doing
so, along with a clear demonstrable link to the research purpose, is an
absolute must.


(d) The results of the study must be presented in an unbiased, objective and
neutral manner. The significant findings can, at best, be supported by
past research, the research approach and its limitations, or by expert
opinion. The researchers’ own judgements and biases must not be allowed
to colour the results at any cost, even when the scope of the study
demands providing recommendations.


(e) The research that you undertake can never be fruitful if it corners the
respondents or exploits their rights. Thus, the research at every
stage and at any cost must maintain the highest ethical standards. For
example, suppose that in the hearing aids study the survey identifies the
pivotal influence of the pathologist in the hearing aid purchase decision.
The pathologists could then be given a commission for bad-mouthing the
competitor’s products to steer the customers towards our product even
when there is a delay in delivery, thus improving our profits without
any major changes being made to the faulty inventory reporting. But
this would be unethical.


(f) And lastly, the reason for a structured, ethical, justifiable and objective 
approach is the fact that the research carried out by us must be replicable. 
This means that the process followed by us must be ‘reliable’, i.e., in 
case the study is carried out under similar constraints and conditions 
it should be able to reveal similar results. 
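
Criterion (b) asks that the estimated error of the sample be specified. One common way to do this for a proportion estimated from a simple random sample is the normal-approximation margin of error. The sketch below assumes that approach and a 95 per cent confidence level (z = 1.96); the function names and the figures are illustrative and do not come from the text.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p estimated from n units.

    Assumes a simple random sample and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(error, p=0.5, z=1.96):
    """Smallest n whose margin of error does not exceed `error`.

    p = 0.5 is the most conservative choice when the proportion is unknown.
    """
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)


# Hypothetical survey: 400 respondents, half of whom prefer our product.
moe = margin_of_error(0.5, 400)
n_needed = required_sample_size(0.05)

print(round(moe, 3))  # margin of error as a proportion
print(n_needed)       # respondents needed for a 5-point margin
```

Stating such a figure in the research plan lets a reader judge how far the sample results can be generalised to the population.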


Check Your Progress 


3. What does production management include? 


4. What does exposition consist of? 


5. How should the results of the study be presented? 


3.5 PROCESS OF RESEARCH 


The process of research is implemented as a series of actions or steps that are
broadly performed in a specific order. In practice, these actions or activities
often overlap each other rather than following a strict sequence. A brief
description of the steps is given as follows:


• Selecting the topic: The first step for a researcher is to select a
research topic. While doing so, he should restrict himself to the most
promising topic, out of several alternatives, that is open to extensive
research. The factors to be considered for topic selection are:
o Relevance 


o Scope for research, i.e. the required data should be available and 
accessible 


o Contribution to knowledge in the specific field 


o Required cooperation from the research guide 


• Define the research problem: There are two types of research
problems:


o Problems related to the state of nature 
o Problems related to the relationship of variables 


In defining the research problem, the researcher should study the 
existing literature like books and journals, available in the field with 
an interdisciplinary perspective to base his research topic on some 
reliable background. He should also concentrate on the relevance of 
the present research with the past works. 


• Mention the objective of research: After selecting the topic and
defining the research problem, the researcher should mention the 
objective of research. This means that he should explain what he aims 
to achieve through the research. His objective should also explain the 
extent to which the research work is related to the specific field. 


• Survey existing literature: To understand the basis of research, it
is important for the researcher to review the existing literature. This 
involves: 


o Surveying the existing books available in the field 


o Reviewing other published literature like articles, journals, reports, 
conference proceedings etc. 


The researcher should then prepare his own index for a period, in 
chronological order, in addition to his consultation of various indices. 


• Development of working hypothesis: A hypothesis is a tentative
statement that proposes an answer to the problem. The hypothesis
statement gives high priority to accountability and responsibility in
the research procedure. The solution proposed by a working hypothesis
cannot be considered the only solution to the research problem. It
acts only as the link or interface between the theory and the research
problem, and can also be considered the point of departure.
Hypotheses are thus tentative statements that can be either rejected
or accepted after the research process. A hypothesis also provides a
structure and guides the researcher in the direction he should move
to reach the solution of the problem. The researcher must keep the
following things in mind while formulating a working hypothesis:


o A hypothesis can only be developed after the researcher is certain
about the nature, extent and intensity of the problem.


o The hypothesis should be worked out over the course of the research
process, as this gives an appropriate structure to the problem.


o The researcher should keep in mind that a hypothesis is only a
tentative statement or solution of the problem and should not be
over-generalised.


o A research problem need not have only one hypothesis. The number
of hypotheses required to solve the problem depends on the
research proposal.


• Preparing the research design: Once the researcher has gained enough
knowledge about the problem statement, he needs to prepare the plan
that will act as the outline of the investigation in the research process.
The research design consists of a series of steps that have to be carried
out during research.


There are two types of research design, the second of which has two
subtypes:
o Exploratory research
o Conclusive research
- Descriptive research
- Causal research


Exploratory research: The researcher conducts exploratory research 
when the problem has not been defined or he has not gained much 
knowledge about the research problem. Exploratory research allows 
the researcher to become familiar with the problem or the concept to 
be studied. The researcher can determine the best research design, 
data collection method and selection of subjects with this research. 
Sometimes, this research can also conclude that the problem does 
not exist. Exploratory research can be quite informal and can rely on 
secondary research and qualitative research. 


Conclusive research: As its name suggests, conclusive research is used to
provide information that helps the researcher to reach a conclusion or
make a decision. This research is likely to be quantitative in nature. It
depends on both secondary data, which is also called existing data, and
primary research, or data that is collected for the current study only.


o Descriptive research: Descriptive research, also called statistical
research, provides data about the population or universe that
has to be studied during research. Descriptive research provides
information about the ‘who, what, when, where and how’ of a situation
but it does not provide information about what caused the problem.
The researcher can use descriptive research when the objective
is to provide a systematic, accurate and factual description. There are
two types of descriptive research design:
- Observations
- Surveys


o Causal research: It is used to find out the variable causing certain 
behaviour. This research is applicable when the researcher has the 
knowledge of variables that are causing the problem and that are 
affected by the problem. This type of research tends to be very 
complex and the researcher may sometimes be unable to determine 
the attitude of an individual by this research. There are two types 
of causal research design: 


- Experimentation 
- Simulation 


• Determine the sample design: Often only a few items are selected
from the universe for study purposes instead of performing a complete
census inquiry; blood testing, for example, is done on a sample
basis. The items selected are technically
known as a sample. The researcher must decide the way of selecting
a sample, that is, decide on the sample design. A sample design is a
definite plan, determined before data collection, for obtaining a sample
from a given population. The various types of sample designs are as follows:


o Deliberate sampling
o Simple random sampling
o Systematic sampling
o Stratified sampling
o Quota sampling
o Cluster sampling
o Multi-stage sampling
o Sequential sampling


The researcher should decide the sample design after considering the
nature of inquiry and other related factors. Sometimes several of the
above-mentioned methods of sampling are used in the same study, which is
called mixed sampling.


• Data collection: There are a variety of ways to collect data. Primary
data can be collected through experiments or through surveys. If the
researcher performs an experiment, he records some quantitative
measurements. This helps him examine the validity of his hypothesis.
In the case of surveys, however, the researcher can adopt one or more
of the following ways to collect data:


o By observation 

o Through personal interviews 
o Through telephone interviews 
o By mailing of questionnaires 
o Through schedules 


• Execution of project: This is the most important step in the research
process. The researcher should ensure that the project is performed in a 
logical way and in time. If a survey is to be carried out, steps should be 
taken to ensure that it is under statistical control, so that the collected 
data is in accordance with the pre-determined standard of accuracy. 


• Analysis of data: After data collection, the researcher turns to the
task of analysing it. The bulk data should be compressed into a few 
manageable groups and tables for further analysis. The researcher can 
then analyse the collected data by using various statistical measures. 


• Hypothesis testing: After analysing the data, the researcher should
test the hypothesis, if any. He should check whether the facts support the
hypothesis or contradict it. Statisticians have developed tests such as the
chi-square test, t-test and F-test for hypothesis testing. This testing
results in either acceptance or rejection of the hypothesis.


• Generalisations and interpretations: The real value of research lies
in its ability to arrive at certain generalisations. If the researcher had
no hypothesis to start with, he might seek to explain his findings on
the basis of some theory. This is called interpretation. This may give
rise to new questions and lead to further research.


• Preparation of report or thesis: This is the concluding step of research,
where the researcher has to prepare a report of what has been done
by him. Generally, the report should be designed in accordance with the
following layout:


o The preliminary pages: Here the title, date, acknowledgements 
and foreword with the table of contents, should be mentioned. 


o The main text: This should be divided into introduction, summary, 
main report and conclusion. 


o End matter: This should contain appendices, bibliography and 
index. 


A report should be written in a precise and objective style in simple
language. Charts and illustrations should be included to emphasise the
findings of the study.
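
Three of the steps above (sample design, analysis of data and hypothesis testing) can be sketched together in Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the sampling frame of 1,000 employees, the shift and turnover categories, the function names and the fixed random seed are all invented for the example, and only three of the listed sample designs are shown. The test used is the chi-square test named in the text, applied to a 2 x 2 table.

```python
import random
from collections import Counter


def simple_random_sample(frame, n, rng):
    """Every unit in the frame has an equal chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit, where k = N // n (needs n <= N)."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(frame, stratum_of, n, rng):
    """Sample each stratum in proportion to its share of the frame.

    Rounding can make the total differ slightly from n in general.
    """
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))
        sample.extend(rng.sample(units, share))
    return sample


def contingency_table(sample, rows, cols):
    """Compress raw (row, column) records into a table of counts."""
    counts = Counter(sample)
    return [[counts[(r, c)] for c in cols] for r in rows]


def chi_square_statistic(observed):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected:  # skip cells in an empty row or column
                stat += (count - expected) ** 2 / expected
    return stat


# Hypothetical frame: 1,000 employees as (shift, turnover intention) pairs.
frame = ([("night", "quit")] * 240 + [("night", "stay")] * 160
         + [("day", "quit")] * 180 + [("day", "stay")] * 420)

rng = random.Random(7)  # fixed seed so the run is repeatable
sample = stratified_sample(frame, lambda unit: unit[0], 100, rng)
table = contingency_table(sample, ["night", "day"], ["quit", "stay"])
statistic = chi_square_statistic(table)

# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
reject_null = statistic > 3.841
print(table, round(statistic, 2), reject_null)
```

Here the null hypothesis is that turnover intention does not depend on shift type. A statistic above the critical value leads to its rejection; otherwise the data give no ground to reject it.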


Establishing Operational Definitions


Having identified and defined the variables under study, the next step requires
operationalizing the stated relationship in the form of a theoretical framework.
This is an outcome of the problem audit conducted prior to defining the
research problem; it can be best understood as a schema or network of the
probable relationship between the identified variables. Another advantage
of the model is that it clearly demonstrates the expected direction of the
relationships between the concepts. There is also an indication of whether
the relationship would be positive or negative.


This step however is not mandatory as sometimes the objective of the 
research is to explore the probable variables that might explain the observed 
phenomena (DV) and the outcome of the study helps to theorize and propose 
a conceptual model. 


The theoretical framework, once formulated, is a powerful driving force 
behind the research process and ought to be comprehensively developed. It 
requires a thorough understanding of both theory and opinion. 


‘The specific way in which a variable is measured in a particular 
study is called the operational definition. It is critical to operationally define 
a variable in order to lend credibility to the methodology and to ensure the 
reproducibility of the results. Another study may measure the same variable 
differently. The operational definition also helps to control the variable by 
making the measurement constant. Therefore, when it comes to operational 
definitions of a variable, the more detailed the definition is, the better.’ 
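
As an illustration, an operational definition can be written down precisely enough to be computed. The sketch below assumes a hypothetical measure of work exhaustion: three questionnaire items, each rated from 1 to 5, averaged into a single score. The item names, the scale and the averaging rule are assumptions made for the example and are not taken from any published instrument.

```python
from statistics import mean

# Hypothetical questionnaire items, each rated on a 1-5 agreement scale.
EXHAUSTION_ITEMS = ["drained_after_work", "tired_in_morning", "work_is_a_strain"]


def work_exhaustion_score(responses):
    """Operational definition (assumed): the mean of the three item ratings."""
    ratings = [responses[item] for item in EXHAUSTION_ITEMS]
    if not all(1 <= r <= 5 for r in ratings):
        raise ValueError("each rating must lie between 1 and 5")
    return mean(ratings)


# One hypothetical respondent.
respondent = {"drained_after_work": 4, "tired_in_morning": 5, "work_is_a_strain": 3}
score = work_exhaustion_score(respondent)
print(score)
```

Because the rule is explicit, another researcher applying it to the same responses would obtain the same score, which is what makes the measurement reproducible.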


Check Your Progress 


6. When is exploratory research conducted? 


7. List the ways of collecting data. 


3.6 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. A variable is generally a symbol to which we assign numerals or values.


2. Descriptive hypothesis is simply a statement about the magnitude, 
trend or behaviour of a population under study. 


3. Production management includes physical distribution of products, 
facility planning and manufacturing planning. 


4. Exposition consists of statements that describe without attempting to 
explain. 


5. The results of the study must be presented in an unbiased, objective
and neutral manner.


6. The researcher conducts exploratory research when the problem has not
been defined or he has not gained much knowledge about the research
problem.


7. The researcher can adopt one or more of the following ways to collect 
data: 


• By observation
• Through personal interviews
• Through telephone interviews
• By mailing of questionnaires
• Through schedules


3.7 SUMMARY 


• To carry out an investigation, it becomes imperative to convert the
concepts and constructs to be studied into empirically testable and
observable variables.


• A variable may be dichotomous in nature, that is, it can possess only
two values such as male/female or customer/non-customer.


• Variables can be further classified into five categories, depending on
the role they play in the problem under consideration. These include:
o Dependent variable
o Independent variable
o Moderating variables
o Intervening variables
o Extraneous variables


• Any assumption that the researcher makes on the probable direction
of the results that might be obtained on completion of the research
process is termed a hypothesis.


• Unlike the research problem, which generally takes a question form,
the hypothesis is always in a declarative form. The statements thus
formulated can lend themselves to empirical enquiry.


• A hypothesis must be measurable and quantifiable so that the statistical
authenticity of the relationship can be established.


• Good business research is based on sound reasoning such as finding
premises that are correct, testing the connections between their
assumptions and facts, and making claims that are based on adequate
evidence.


• Scientific methods are practised in business research to guide our
approach to problem solving.


• Argument allows us to defend, challenge, explain, interpret and explore
meaning. The two types of argument that are of great importance to
research are deductive thought and inductive thought.


• Deductive thought, also called deductive logic, is the process of
reasoning from one or more general statements regarding what is
known to reach a logically certain conclusion.


• On the other end of the continuum is inductive thought. Here there is
no strong and absolute cause and effect between the reasons stated and
the inference drawn.


• The research must have a clearly stated purpose, whether implicit, as
when the purpose is to develop a new system of inventory management, or
explicit, as when it is to establish quality standards for the service
delivery model in our mobile eye care unit.


• The reason for a structured, ethical, justifiable and objective approach
is the fact that the research carried out by us must be replicable. This
means that the process followed by us must be ‘reliable’, i.e., in case
the study is carried out under similar constraints and conditions it
should be able to reveal similar results.


• The process of research is implemented as a series of actions or steps
that are broadly performed in a specific order. In practice, these actions
or activities often overlap each other rather than following a strict
sequence.


• A hypothesis is a tentative statement that proposes an answer to the
problem. The hypothesis statement gives high priority to accountability
and responsibility in the research procedure.


• Having identified and defined the variables under study, the next
step requires operationalizing the stated relationship in the form of
a theoretical framework. This is an outcome of the problem audit
conducted prior to defining the research problem.


• The theoretical framework, once formulated, is a powerful driving
force behind the research process and ought to be comprehensively
developed.


3.8 KEY WORDS


• Objective: It refers to something not influenced by personal feelings
or opinions in considering and representing facts.


• Deductive Reasoning: It is reasoning that is characterized by or based
on the inference of particular instances from a general law.


• Inductive Reasoning: It is a method of reasoning in which the premises
are viewed as supplying some evidence for the truth of the conclusion.
The truth of the conclusion of an inductive argument may be probable,
based upon the evidence given.


• Proposition: It is a statement or assertion that expresses a judgement
or opinion.


3.9 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions 


1. Discuss the significance of research in the social sciences. 
2. What is good business research based on? 

3. List the essential tenets of the scientific method. 

4. Write a short note on inductive reasoning.

5. How is a theoretical framework formulated?


Long-Answer Questions


1. Describe the various classifications of variables.

2. Examine the different types of hypothesis. 

3. Explain inductive and deductive reasoning in detail. 
4. Discuss the features of a good research study. 


5. Examine the various steps involved in the research process. 


4.0 INTRODUCTION


In this unit, you will learn about the planning of research. Before a research 
work is carried out, a lot of planning is required. Proper planning helps in 
performing research work with much ease. This unit focusses on the basic 
concept of planning a research which is essential while conducting research 
for a specific purpose. Ideas such as research problems are discussed. Research 
problems are questions that indicate gaps in the scope or the certainty of our 
knowledge. Discovering a problem puts the research process into action and 
identification of the purpose is the first step towards the solution. 


4.1 OBJECTIVES


After going through this unit, you will be able to: 
• Define a research problem
• Describe how a research problem is identified, selected and formulated
• Discuss review of literature in the field of business


4.2 RESEARCH PROBLEM 


Research problems are questions that indicate gaps in the scope or the
certainty of our knowledge. They point either to problematic phenomena,
that is, observed events that are puzzling in terms of the accepted ideas,
or to problematic theories, that is, current ideas that are challenged by
new hypotheses.


Defining the Research Problem


Problem discovery puts the research process into action and identification 
of the problem is the first step towards its solution. Properly and completely 
defining a business problem is easier said than done. Actually, the research 
task may be to define or evaluate an opportunity or to clarify a problem. 
The definition and discovery of the research problem is viewed under this 
broader context. In research, often, only symptoms are apparent to begin 
with. The adage ‘a problem well defined is a problem half solved’ is worth 
remembering. The investigation gets a sense of direction with an orderly 
definition of the research problem. A careful attention to the problem 
definition allows a researcher to set the proper research objectives. When 
the purpose of research is clear, the chances of collecting the relevant and 
necessary information are greater. However, just because a problem has 
been discovered or an opportunity has been recognized does not mean that 
the problem has been defined. A problem definition indicates a specific 
managerial decision area to be clarified or a particular problem to be solved. 
It specifies research questions to be answered and the objectives of the 
research. 


4.2.1 Identification, Selection and Formulation of Research Problem 


The first and the most important step of the research process is to identify 
the path of enquiry in the form of a research problem. It is like the onset of 
a journey, in this instance the research journey, and the identification of the 
problem gives an indication of the expected result being sought. A research 
problem can be defined as a gap or uncertainty in the decision makers’ 
existing body of knowledge which inhibits efficient decision making. 
Sometimes it may so happen that there might be multiple reasons for these 
gaps and identifying one of these and pursuing its solution, might be the 
problem. As Kerlinger (1986) states, ‘If one wants to solve a problem, one 
must generally know what the problem is. It can be said that a large part of 
the problem lies in knowing what one is trying to do.’ The defined research 
problem might be classified as simple or complex (Hicks, 1991). Simple 
problems are those that are easy to comprehend and their components and 
identified relationships are linear and easy to understand, e.g., the relation 
between cigarette smoking and lung cancer. Complex problems, on the other
hand, deal with the interrelationship between antecedents and, subsequently,
with the consequential component. Sometimes the relation might be further
impacted by the moderating effect of external variables as well, e.g., the effect
of job autonomy and organizational commitment on work exhaustion, at the
same time considering the interacting (combined) effect of autonomy and
commitment. This might be further different for males and females. These
kinds of problems require a model or framework to be developed to define
the research approach.


Thus, the significance of a clear and well-defined research problem 
cannot be overemphasized, as an ambiguous and general issue does not 
lend itself to scientific enquiry. Even though different researchers have 
their own methodology and perspective in formulating the research topic, 
a general framework which might assist in problem formulation is given as 
follows: 


Problem identification process 


The problem recognition process invariably starts with the decision maker 
and some difficulty or decision dilemma that he/she might be facing. This is 
an action oriented problem that addresses the question of what the decision 
maker should do. Sometimes, this might be related to actual and immediate 
difficulties faced by the manager (applied research) or gaps experienced in 
the existing body of knowledge (basic research). The broad decision problem 
has to be narrowed down to information oriented problem which focuses 
on the data or information required to arrive at any meaningful conclusion. 
Given in Figure 4.1 is a set of decision problems and the subsequent research 
problems that might address them. 


Management decision problem 


The entire process explained above begins with the acknowledgement and 
identification of the difficulty encountered by the business manager/researcher. 
If the manager is skilled enough and the nature of the problem requires to be 
resolved by him or her alone, the problem identification process is handled by 
him or her, else he or she outsources it to a researcher or a research agency. 
This step requires the author to carry out a problem appraisal, which would 
involve a comprehensive audit of the origin and symptoms of the diagnosed 
business problem. For illustration, let us take the first problem listed in
Figure 4.1. An organic farmer and trader in Uttarakhand, Nirmal farms, wants
to sell his organic food products in the domestic Indian market. However, 
he is not aware if this is a viable business opportunity and since he does not 
have the expertise or time to undertake any research to aid in the formulation 
of the marketing strategy, he decides to outsource the study. 


1. DECISION PROBLEM: What should be done to increase the customer base
   of organic products in the domestic market?
   RESEARCH PROBLEM: What is the awareness and purchase intention of
   health conscious consumers for organic products?

2. DECISION PROBLEM: How to reduce turnover rates in the BPO sector?
   RESEARCH PROBLEM: What is the impact of shift duties on work
   exhaustion and turnover intentions of the BPO employees?

3. DECISION PROBLEM: How to improve the delivery process of Widex
   hearing aids in India?
   RESEARCH PROBLEM: How does Widex/industry leader manage its supply
   chain in India/Asia?

4. DECISION PROBLEM: Should the company continue with its existing
   vendor or look at an alternative?
   RESEARCH PROBLEM: What is the satisfaction level of the company with
   the existing vendor? Are there any gaps? Can they be effectively
   handled by the vendor?

5. DECISION PROBLEM: Can the Housing and real estate growth be
   accelerated?
   RESEARCH PROBLEM: What is the current investment in Real Estate and
   Housing? Can the demand in the sector be forecasted for the next six
   months?

6. DECISION PROBLEM: Whom should ICICI choose as its next Managing
   Director: Mr ABC or Mrs XYZ?
   RESEARCH PROBLEM: What have been the leadership initiatives and
   performance record of ABC vis-à-vis XYZ? Can a leading aggressive
   private sector bank accept a woman as its leader?


*However, the transition from the decision problem to the research problem
is not an easy task and requires a sequential stepwise approach (presented
in Fig. 2.3).


Fig. 4.1 Converting Management Decision Problem into Research Problem 


Discussion with subject experts 


The next step involves getting the problem in the right perspective through
discussions with industry and subject experts. These individuals are
knowledgeable about the industry as well as the organization. They could be
found both within and outside the company. The information required on the
current and probable scenario is obtained with the assistance of a
semi-structured interview. Thus, the researcher must have a predetermined set of
questions related to the doubts experienced in problem formulation. It should
be remembered that the purpose of the interview is simply to gain clarity on
the problem area and not to arrive at any kind of conclusions or solutions to
the problem. For example, for the organic food study, the researcher might
decide to go to food experts in the Ministry for Food and Agriculture or
agricultural economists or retailers stocking health food as well as doctors
and dieticians. In most cases, however, this data is not sufficient, while in
other cases access to subject experts might be extremely difficult
as they might not be available. The information should, in practice,
be supplemented with secondary data in the form of theoretical as well as
organizational facts.


Review of existing literature


A literature review is a comprehensive compilation of the information obtained 
from published and unpublished sources of data in the specific area of interest 
to the researcher. This may include journals, newspapers, magazines, reports, 
government publications, and also computerized databases. The first advantage
of the survey is that it provides different perspectives and methodologies to
be used to investigate the problem, and helps identify possible variables that
may need to be investigated. Second, the survey might also uncover the fact
that the research problem being considered has already been investigated 
and this might be useful in solving the decision dilemma. It also helps in 
narrowing the scope of the study into a manageable research problem that is 
relevant, significant and testable. 


Once the data has been collected from different sources, the researcher 
must collate all information together in a cogent and logical manner instead of 
just listing the previous findings. This documentation must avoid plagiarism 
and ensure that the list of earlier studies is presented in the researcher’s own 
words. The logical and theoretical framework developed on the basis of past 
studies should be able to provide the foundation for the problem statement. 


The reporting should cite clearly the author and the year of the study. 
There are several internationally accepted forms of citing references and 
quoting from published sources. The Publication Manual of the American 
Psychological Association (2001) and the Chicago Manual of Style (1993) 
are academically accepted as referencing styles in management. 


To illustrate the significance of a literature review, given below is a 
small part of a literature review done on organic purchase. 


Research indicates that organic food is of better quality. The pesticide residue
in conventional food is almost three times the amount found in organic food.
Baker et al. (2002) found that, on average, conventional food is more than
five times as likely to have chemical residue as organic samples. Pesticide
toxicity has been found to have detrimental effects on infants, pregnant
women and the general public (National Research Council, 1993; Ma et al., 2002;
Guillete et al., 1998). Major factors that promote growth in the organic market
are consumer awareness of health, environmental issues and food scandals
(Yossefi and Willer, 2002).


This paragraph helps justify the relevance and importance of organic
versus non-organic food products as well as identify variables that might
contribute positively to the growth in consumption of organic products.


Organizational analysis


Another significant source for deriving the research problem is the industry 
and organizational data. In case the researcher/investigator is the manager 
himself/herself, the data might be easily available. However, in case the study 
is outsourced, the detailed background information of the organization must 
be compiled, as it serves as the environmental context in which the research 
problem has to be defined. It is to be remembered at this juncture that the 
organizational context might not be essential in case of basic research, where 
the nature of study is more generic. 


This data needs to include the organizational demographics—origin and 
history of the firm; size, assets, nature of business, location and resources; 
management philosophy and policies as well as the detailed organizational 
structure, with the job descriptions. 


Qualitative survey 


Sometimes the expert interview, secondary data and organizational 
information might not be enough to define the problem. In such a case, an 
exploratory qualitative survey might be required to get an insight into the 
behavioural or perceptual aspects of the problem. These might be based on 
small samples and might make use of focus group discussions or pilot surveys 
with the respondent population to help uncover relevant and topical issues 
which might have a significant bearing on the problem definition. 


In the organic food research, focus group discussions with young
and old consumers revealed the level of awareness about organic food and
consumer sentiments related to the purchase of a more expensive but healthier
alternative food product.


Management research problem 


Once the audit process of secondary review, interviews and survey is
over, the researcher is ready to focus on and define the issues of concern that
need to be investigated further, in the form of an unambiguous and clearly
defined research problem. Once again, it is essential to remember that simply
using the word ‘problem’ does not mean there is something wrong that has
to be corrected; it simply indicates gaps in the information or knowledge
base available to the researcher. These gaps might be the reason for the
researcher’s inability to take the correct decision. Second, identifying all
possible dimensions of the problem might be a monumental and impossible task
for the researcher. For example, the lack of sales of a newly launched product
could be due to consumer perceptions about the product, an ineffective supply
chain, gaps in the distribution network, competitor offerings or ineffective
advertising. It is the researcher who has to identify and then refine the most
probable cause of the problem and formalize it as the research problem. This
would be achieved through the four preliminary investigative steps indicated above.


4.2.2 Identification of Objectives of the Research


Let us discuss the research objectives.


Statement of research objectives


Next, the research question(s) that were formulated need to be broken down 
and spelt out as tasks or objectives that need to be met in order to answer 
the research question. 


Based on the framework of the study, the researcher has to numerically 
list the thrust areas of research. This section makes active use of verbs such 
as ‘to find out’, ‘to determine’, ‘to establish’, and ‘to measure’ so as to spell 
out the objectives of the study. In certain cases, the main objectives of the 
study might need to be broken down into sub-objectives which clearly state 
the tasks to be accomplished. 


In the organic food research, the objectives and sub-objectives of the 
study were as follows: 


1. To study the existing organic market: This would involve:

   • To categorize the organic products available in Delhi into grain,
     snacks, herbs, pickles, squashes and fruits and vegetables;

   • To estimate the demand pattern of various products for each of the
     above categories;

   • To understand the marketing strategies adopted by different players
     for promoting and propagating organic products.


2. Consumer diagnostic research: This would entail:

   • To study the existing consumer profile, i.e., perception and attitudes
     towards organic products and purchase and consumption patterns;

   • To study the potential customers in terms of consumer segments,
     level of awareness, perception and attitude towards health and
     organic products.


3. Opinion survey: To assess the awareness and opinions of experts
   such as doctors, dieticians and chefs in order to understand organic
   consumption and propagation.


4. Retail market: This would involve:

   • To find the gap between demand and supply for existing retailers;

   • To forecast demand estimates by considering the existing as well
     as potential retailers.


Thus, the research problem formulation involves the following 
interrelated steps: 


• Ascertaining the objectives of the decision-maker

• Understanding the problem’s background

• Identifying and isolating the problem, rather than its symptoms

• Determining the unit of analysis

• Determining the relevant variables

• Stating the research objectives and research questions (hypotheses).


The above-mentioned process ensures that the real research objectives/ 
questions are identified for the proposed research. 


4.2.3 Statement of Research Problem and Cost and Value Information 


Both decision-makers and researchers expect that the problem definition
efforts should result in a statement of the research problem or research
objectives. On completion of the exercise of formulating the research
problem, the researcher must prepare a written statement that removes any
ambiguity about what s/he hopes the research will accomplish. Writing a series
of research questions and hypotheses can add clarity to the statement of the
business problem. These research questions are the researcher’s translation
of the business problem into a specific need for inquiry, and a hypothesis is an
unproven proposition that tentatively explains certain facts or phenomena, a
proposition that is empirically testable. In other words, research objectives/
hypotheses explain the purpose of research in measurable terms and define
standards for what the research should accomplish.


Values and Cost of Information 


The value and cost of information play an essential role in estimating the
importance of information as well as the total expenditure to be incurred in
acquiring the information.


Value of information 


Human beings have evolved in such a way that they can appreciate the role of
information in their lives without much effort. The initial phase of human
civilization has taught us to appreciate instinctively the importance of
information and communication. We are instinctively alert to information.
However, the human brain has evolved to understand that information has 
different degrees of ‘value’ (which the brain unconsciously rates). Information 
can be defined as processed data, which helps in decision-making and/or 
facilitates communication within an organization. More often, information 
provides answers to ‘who’, ‘what’, ‘where’, and ‘when’ type of questions. 
The human brain prioritizes information, according to its perceived value 
(most often this unconscious valuation mechanism in our brain is correct, 
more so in the case of instinct-based information). 


For example, let us assume that a driver notices a child suddenly
crossing the road and calculates that he will hit the child unless he stops and,
at the same time, he feels an itching sensation on his forehead. In this case,
the brain of the driver prioritizes two different pieces of information received
from different sensory inputs. It reacts by sending a signal to the driver’s right
foot to press the brake pedal to stop the car, and only after the car stops will
the brain react to the itch. We unconsciously do this every day. Evolution has
taught us that information has a context and hence different degrees of value.


The principal objective of research is to find solutions to problems 
systematically. In general, the objectives of value of information with respect 
to research can be specified as follows: 


• To extend the knowledge of human beings, the environment and natural
  phenomena to others.

• To bring out information which is not developed fully during the ordinary
  course of life.

• To verify existing facts and identify changes in these existing facts.

• To develop facts for critical evaluation.

• To analyse inter-relationships between variables and derive causal
  explanations.

• To develop new tools and techniques for studying unknown phenomena.

• To help in planning and development.

• To acquire familiarity with a phenomenon.

• To study the frequency of connection or independence of any activity
  or occurrence.

• To determine the characteristics of an individual or a group of activities
  and the frequency of occurrence of these activities.

• To test a hypothesis about a causal relationship that exists between
  variables.


The value of information is determined on the basis of the benefits that are
derived from the information. Consider an example where two products, A
and B, are developed. The benefits derived from product A evaluate to 20 units
and the benefits derived from product B evaluate to 30 units. The difference
between the benefits of the two products is 10 units.


Suppose some additional information raises the benefits derived from product A
by 20 units, from 20 to 40. This gain is not yet the actual value of the
information, because obtaining the information also has a cost; here the cost
of information increases by 20 units. To determine the actual value of the
information, you need to subtract the cost involved in obtaining it from the
gain in benefits.
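

The arithmetic of the product A example can be sketched in a few lines of Python. This is only an illustration: the function name `net_value_of_information` and its parameter names are hypothetical, and the sketch assumes that benefits and costs are measured in the same units.

```python
def net_value_of_information(benefit_without, benefit_with, cost_of_information):
    """Return the gain in benefits minus the cost of obtaining the information."""
    # Extra benefit that can be attributed to the information.
    gross_gain = benefit_with - benefit_without
    # What remains once the cost of the information has been paid.
    return gross_gain - cost_of_information


# Figures from the product A example: benefits rise from 20 to 40, cost is 20 units.
net_value = net_value_of_information(
    benefit_without=20, benefit_with=40, cost_of_information=20
)
print(net_value)  # 0
```

With these figures the gain of 20 units is exactly offset by the cost of 20 units. On this reading, the information would add value only if it cost less than the gain it produces.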


Cost of information 


The cost of information is the cost involved in obtaining the
information, which includes:

• Cost of acquiring the data.

• Cost of maintaining the data.

• Cost of generating the information.

• Cost of communicating the information.


The cost is estimated from the point the information is generated to
the point the information is retrieved. The cost of obtaining accurate and
complete information is higher than the cost of the information generally
retrieved from the system.


4.3 REVIEW OF LITERATURE IN THE FIELD OF BUSINESS


A literature review is the presentation, classification and evaluation of what 
other researchers have written on a particular subject. A literature review may 
form part of a research thesis, or may stand alone as a separate document. 
Although the second of these types of literature review is less extensive than 
that expected for a thesis, the skills required are identical. A literature review 
is not simply a shopping list of what others have said. It does not and cannot 
refer to every piece of literature in the field. Rather, a literature review is 
organized according to a particular research objective. It is a conceptually 
organized synthesis which ultimately provides a rationale for further research, 
whether by you or by others. The few basic purposes that a business literature
review should fulfil are the following:


• Compare and contrast different authors’ views on an issue

• Group authors who draw similar conclusions

• Criticize aspects of methodology

• Note areas in which authors are in disagreement

• Highlight exemplary studies

• Highlight gaps in research


Two essential elements of all literature reviews (though they are not 
formally identified as such) are: 


1. An outline of what others have done in your chosen area

2. A progressive narrowing to the gap in the research


Economic management areas covered under literature review


The various economic areas covered under literature review are as follows:

• Economic growth and development
• Economics of organizations and industries
• Econometrics
• Economic policy
• Economic theory
• Environmental and agricultural economics
• Financial economics
• Game theory and mathematical methods
• History of economic thought
• International economics
• Law and economics
• Monetary economics
• Industrial organization
• Public finance
• R&D and technology policy
• Regional and social policy
• Labour economics
• Population economics
• Political economy
• Development economics
• Managerial economics
• Financial psychology
• Economic geography
• Real estate economics
• Energy economics
• Green economics
• Computational economics
• Behavioural economics
• Socioeconomics


Business management areas covered under literature review 


The various business management areas covered under literature review are as follows:

• Accounting
• Finance
• Strategic management
• Educational management
• Operations management
• Production management
• Human resource management
• Marketing
• Organizational analysis and planning
• Policy making and decision making
• Ethics in business
• Motivation
• Globalization
• Training and development
• Recruitment and selection
• Industrial relations
• Virtual technology
• Change management
• Entrepreneurship
• Organizational behaviour
• Sustainable business practices
• Total quality management
• Supply chain
• Project management
• Political business strategy
• Innovation management

In a literature review, the work of others is used to cast the gap in
relief (that is, to make it clear). The research question and thesis statement
are then stated precisely before the remainder of the research project (see
the Appendix given at the end of this unit).


Papers can be searched for in two databases with a wealth of business
literature: ABI/ProQuest and EBSCOhost Business Source Complete. These
include peer-reviewed academic journals as well as other sources, such as
newspapers, magazines and trade publications.


Check Your Progress 


1. What are research problems? 
2. What is a literature review? 
3. How is the value of information determined? 


4. List three economic areas covered under literature review.


4.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS


1. Research problems are questions that indicate gaps in the scope or the
certainty of our knowledge.


2. A literature review is a comprehensive compilation of the information 
obtained from published and unpublished sources of data in the specific 
area of interest to the researcher. 


3. The value of information is determined based on the benefits that are 
derived from the information. 


4. Three economic areas covered under literature review are as follows:

   • Economic growth and development
   • Economics of organizations and industries
   • Econometrics


4.5 SUMMARY


• Problem discovery puts the research process into action and
  identification of the problem is the first step towards its solution.

• The significance of a clear and well-defined research problem cannot
  be overemphasized, as an ambiguous and general issue does not lend
  itself to scientific enquiry.

• A problem definition indicates a specific managerial decision area to
  be clarified or a particular problem to be solved. It specifies research
  questions to be answered and the objectives of the research.

• Another significant source for deriving the research problem is the
  industry and organizational data.

• Sometimes the expert interview, secondary data and organizational
  information might not be enough to define the problem. In such a case,
  an exploratory qualitative survey might be required to get an insight
  into the behavioural or perceptual aspects of the problem.

• Once the audit process of secondary review, interviews and survey
  is over, the researcher is ready to focus on and define the issues of
  concern that need to be investigated further, in the form of an
  unambiguous and clearly defined research problem.

• Based on the framework of the study, the researcher has to numerically
  list the thrust areas of research.

• The cost of information is the cost involved in obtaining the
  information.

• The various business management areas covered under literature review
  include the following:
  o Accounting
  o Finance
  o Strategic management


4.6 KEY WORDS 


• Research Problem: It is a statement about an area of concern, a
  condition to be improved, a difficulty to be eliminated, or a troubling
  question that exists in scholarly literature, in theory, or in practice
  that points to the need for meaningful understanding and deliberate
  investigation.

• Research Objectives: They refer to the description of what is to be
  achieved by the study.

• Audit: It means an official inspection of an organization’s accounts,
  typically by an independent body.

• Hypothesis: It refers to a supposition or proposed explanation made on
  the basis of limited evidence as a starting point for further investigation.

• Thesis: It is a statement or theory that is put forward as a premise to
  be maintained or proved.


4.7 SELF ASSESSMENT QUESTIONS AND EXERCISES


Short-Answer Questions


1. How does the researcher define the research problem?

2. What does a literature review include?

3. List the economic areas covered under literature review.


Long-Answer Questions


1. Examine how the research problem is identified, selected and formulated.

2. Discuss review of literature in business.

3. Describe how research objectives are identified.


5.0 INTRODUCTION


In this unit, you will learn about economic management and hypothesis. 
According to Theodorson, ‘a hypothesis is a tentative statement asserting a 
relationship between certain facts.’ Kerlinger describes it as ‘a conjectural 
statement of the relationship between two or more variables’. A hypothesis
is more useful when stated in precise and clearly defined terms. A good
hypothesis is one that fulfils its intended purpose. The unit will go on to
discuss research design.


It has been found by research scholars and managers alike that most
research studies do not result in any significant findings because of a faulty
research design. Most researchers feel that once the problem is defined and
hypotheses are made, one can go ahead and collect the data on a specified
group, or sample, and then analyse it using statistical tests. However, unless
the formulated research problem and the study hypotheses are tested through
a well-defined plan, answers are going to be based on trial and error rather
than any sound logic.


The design approaches available to the researcher are many and will
depend on whether the study is of a descriptive or conclusive nature. The
designs range from very simple, loosely structured ones to highly scientific
experimentation. Just as with experiments in science, in business research too
there are chances of error, and this needs to be understood and controlled for


5.1 OBJECTIVES


After going through this unit, you will be able to:

• Discuss research gaps

• Describe the different types of hypothesis testing

• Explain the various steps in a research design


5.2 USE IN IDENTIFYING RESEARCH GAPS AND TECHNIQUES


When a researcher is working on original research, he would like to identify 
a need for his research somewhere close to the beginning of his paper. Why 
is it so? Because he would like to show the reader that he is not duplicating 
existing research. He does this by surveying the current research and then 
identifying a gap that he is going to fill. The researcher identifies the broad 
problem and states its importance. He also states what is significant in what 
has already been written. He describes the gap he proposes to fill in the 
existing research literature. This then creates an opportunity for him to make 
a contribution to the research in the area. 


Thus, the process of developing a research proposal is ultimately one of 
establishing a gap in current research which the researcher aims to address. As 
a result, the function of the researcher’s research proposal and the literature 
review chapter of his thesis is to convince the audience that this research gap 
does exist, and that his research is valid and significant. The principal aim
of the researcher is to assess the gaps in research with respect to his area of
research, review current work being carried out in relation to these gaps, and
recommend the most fruitful areas for his research.


The characteristics of a research gap may be summarized as follows:

• It is what makes the researcher’s manuscript publishable.

• It is the missing element in the existing research literature.

• It is the gap that the researcher will fill with his research approach.


5.3 HYPOTHESIS: MEANING, SOURCES, TYPES OF HYPOTHESIS, AND HYPOTHESIS TESTING


The term ‘hypothesis’ is derived from the ancient Greek term hypotithenai,
which means to put under or to suppose. There are several characteristics of
hypotheses, which are as follows:

able to draw a consistent conclusion.


• Statement of relationship between variables: If a hypothesis
  is relational, it should state the relationship between the different
  variables.

• Testability: A hypothesis should be open to testing so that other
  deductions can be made from it and can be confirmed or disproved by
  observation. The researcher should do some prior study to make the
  hypothesis testable.

• Specific with limited scope: A hypothesis which is specific with limited
  scope is more easily testable than a hypothesis with limitless scope.
  Therefore, a researcher should spend more time conducting research
  on this kind of hypothesis.

• Simplicity: A hypothesis should be stated in the simplest and clearest
  terms to make it understandable.

• Consistency: A hypothesis should be reliable and consistent with
  established facts.

• Time limit: A hypothesis should be capable of being tested within
  a reasonable time. In other words, the excellence of a hypothesis is
  judged by the time taken to collect the data needed for the test.

• Empirical reference: A hypothesis should explain or support all the
  sufficient facts needed to understand what the problem is all about.


Hypothesis Testing: Parametric and Non-Parametric Tests 


Hypothesis testing means determining whether or not a hypothesis is
tenable. This involves either accepting or rejecting a null hypothesis.
The researcher has to carry out certain activities contained in the procedure
of hypothesis testing.


Hypotheses: Null and Alternative 


A hypothesis is an approximate assumption that a researcher wants to test for
its logical or empirical consequences. A hypothesis refers to a provisional idea
whose merit needs evaluation. The term has no single specific meaning; it is
also often used for a convenient mathematical approach for simplifying
cumbersome calculations. Setting up and testing hypotheses is an integral part
of statistical inference. Hypotheses are often statements about population
parameters like variance and expected value. During the course of hypothesis
testing, some inferences about population characteristics such as the mean and
proportion are made. Any useful hypothesis will enable predictions by reasoning,
including deductive reasoning. A hypothesis might predict the outcome of an
experiment in a lab setting or the observation of a phenomenon in nature. Thus,
a hypothesis is a proposed explanation of a phenomenon, or a proposal suggesting
a possible correlation between multiple phenomena.


For the purpose of decision-making, a hypothesis has to be verified 
and then accepted or rejected. This is done with the help of observations. 
Decision-making plays a significant role in different areas such as marketing, 
industry and management. Testing a statistical hypothesis on the basis of a 
sample enables us to decide whether the hypothesis should be accepted or 
not. The sample data enables us to accept or reject the hypothesis. 


Null Hypothesis and Alternative Hypothesis 


In the context of statistical analysis and research, while comparing any two 
methods, the following concepts or assumptions are taken into consideration: 


• Null Hypothesis: When two different methods are compared in terms of
  their superiority, the assumption that both methods are equally good is
  called the null hypothesis. It is also known as the statistical
  hypothesis and is symbolized as H₀.


• Alternate Hypothesis: When two different methods are compared
  regarding their superiority, the statement that a particular method is
  good or bad as compared to the other one is called the alternate hypothesis.
  It is symbolized as H₁.


Note 1: A test provides evidence, if any, against a hypothesis, usually called 
a null hypothesis. The test cannot prove the hypothesis to be correct. It can 
give some evidence against it. 


The test of hypothesis is a procedure to decide whether to accept or 
reject a hypothesis. 


Note 2: The acceptance of a hypothesis implies only that there is no evidence
from the sample that we should believe otherwise.


The rejection of a hypothesis leads us to conclude that it is false. This
way of putting the problem is convenient because of the uncertainty inherent
in the problem. In view of this, we must always briefly state a hypothesis that
we hope to reject. A hypothesis stated in the hope of being rejected is called
a null hypothesis and is denoted by H₀.


If H₀ is rejected, it may lead to the acceptance of an alternative
hypothesis denoted by H₁.


To take an example, suppose a new fragrant soap is introduced in the
market. The null hypothesis H₀, which may be rejected, is that the new soap
is not better than any existing soap.


Similarly, suppose a die is suspected to be biased. Roll the die a number of
times to test it.


By the null hypothesis H₀, p = 1/6 for showing a six.


By the alternative hypothesis H₁, p ≠ 1/6.
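

The die example can be illustrated with a short Python sketch. It is not taken from the text: it assumes an exact two-sided binomial test, and the function names, the figure of 60 rolls and the 5 per cent significance level are hypothetical choices made only for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(sixes, rolls, p0=1 / 6):
    """Exact two-sided p-value for H0: p = p0 against H1: p != p0.

    Adds up the probabilities, under H0, of every outcome that is no more
    likely than the one actually observed.
    """
    observed = binomial_pmf(sixes, rolls, p0)
    tolerance = observed * 1e-9  # guards against floating-point ties
    total = sum(
        prob
        for prob in (binomial_pmf(k, rolls, p0) for k in range(rolls + 1))
        if prob <= observed + tolerance
    )
    return min(1.0, total)


def reject_null(sixes, rolls, alpha=0.05):
    """True when the sample gives enough evidence against H0 at level alpha."""
    return two_sided_p_value(sixes, rolls) < alpha


# Hypothetical data: 60 rolls of the suspected die.
print(reject_null(sixes=10, rolls=60))  # about what a fair die is expected to give
print(reject_null(sixes=25, rolls=60))  # far more sixes than expected
```

A fair die is expected to show about 10 sixes in 60 rolls, so the first call gives no evidence against the null hypothesis, while 25 sixes leads to its rejection. In line with Note 1, failing to reject does not prove that the die is fair.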


A hypothesis is usually considered the principal instrument in research. 
The basic concepts regarding the testability of a hypothesis are now discussed. 


Comparison of Null Hypothesis and Alternate Hypothesis 


Following are the points of comparison between the null hypothesis and 
alternate hypothesis: 


• The null hypothesis is always specific while the alternate hypothesis gives an
  approximate value.


• The rejection of the null hypothesis involves great risk, which is not the
  case with the alternate hypothesis.


• The null hypothesis is more frequently used in statistics than the alternate
  hypothesis because it is specific and is not based on probabilities.


Procedure for hypothesis testing 


The procedure for hypothesis testing is as follows: 


1. Making a formal statement: In this step, the nature of a hypothesis
   is clearly stated, which could be either a null hypothesis or an alternate
   hypothesis. Stating the problem in hypothesis testing is of utmost
   importance and should be done with proper care, keeping in mind
   the object and nature of the problem.


2. Choosing a significance level: In this step, a hypothesis is tested on
   the basis of a preset significance level, which has to be adequate in
   terms of the nature and purpose of the problem.


3. Sampling distribution: This step includes determining an appropriate
   sampling distribution and making a choice between the normal distribution
   and the t-distribution.


4. Random selection of a sample: In this step, a random sample is
   selected and an appropriate value is computed from the sample data.


5. Probability calculation: In this step, the probability of obtaining the
   sample result, assuming that the null hypothesis is true, is calculated.


6. Comparison: In this step, the calculated probability is compared with the
   value of alpha in the case of a one-tailed test and alpha/2 in the case of a
   two-tailed test. If the calculated probability is smaller, the null
   hypothesis is rejected.
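

The six steps above can be traced in a minimal Python sketch. It assumes a one-sample test of a population mean with a known standard deviation, so that the normal distribution applies in step 3; the function name `one_sample_z_test` and all the figures used are hypothetical and serve only to illustrate the procedure.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, two_tailed=True):
    """Test H0: population mean == mu0, assuming sigma is known.

    Step 1 is the statement of H0 above; step 2 is the choice of alpha.
    Returns (z, tail_probability, reject_h0).
    """
    # Step 3: with a known sigma, the sample mean follows a normal distribution.
    standard_error = sigma / sqrt(n)
    # Step 4: the test statistic computed from the random sample.
    z = (sample_mean - mu0) / standard_error
    # Step 5: probability, under H0, of a result at least this extreme in one tail.
    if two_tailed:
        tail_probability = 1 - NormalDist().cdf(abs(z))
    else:  # upper-tailed alternative, H1: mean > mu0
        tail_probability = 1 - NormalDist().cdf(z)
    # Step 6: compare with alpha (one-tailed) or alpha / 2 (two-tailed).
    threshold = alpha / 2 if two_tailed else alpha
    return z, tail_probability, tail_probability < threshold


# Hypothetical figures: H0 says the mean is 50; a sample of 25 has mean 52, sigma = 5.
z, tail, reject = one_sample_z_test(sample_mean=52, mu0=50, sigma=5, n=25, alpha=0.05)
print(round(z, 2), round(tail, 4), reject)  # 2.0 0.0228 True
```

Here the tail probability of about 0.0228 is below alpha/2 = 0.025, so the null hypothesis is rejected at the 5 per cent level. With alpha = 0.01 the same sample would not lead to rejection. When the standard deviation is unknown and the sample is small, the t-distribution would be the appropriate choice in step 3.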


Check Your Progress 


1. What is the process of developing a research proposal? 
2. What is null hypothesis? 


5.4 RESEARCH DESIGN 


Research design is a structure that gives an outline of the overall research 
work. It is the result of better planning and implementation of a good strategy. 
Different authors have given different definitions of a research design. 
According to Kerlinger, research design is the plan, structure and strategy 
of investigation conceived so as to obtain answers to research questions and 
to control variance. Bernard Phillips defines research design as the blueprint 
for collection, measurement and analysis of data. 


Green et al. (2008) define research design as ‘the specification of
methods and procedures for acquiring the information needed. It is the 
overall operational pattern or framework of the project that stipulates what 
information is to be collected from which sources by what procedures. If it 
is a good design, it will insure that the information obtained is relevant to 
the research questions and that it was collected by objective and economical 
procedures.’ 


The decisions that you need to take to formulate a research design 
should be based on the following questions: 


• What is the research all about? 
• Why is the research being done? 
• What kind of data is required for the research? 
• From where can the data be obtained? 
• How much time will the research take? 
• What is a sample research design? 
• How should the data be analysed? 
• What is the style of report preparation? 


A research design helps a researcher to organize ideas and check 
for flaws and inadequacies in the collected data. It involves the following 
elements: 


• A statement that clearly defines the problem for which the research is 
being done 


• Procedures and techniques for gathering the information required for 
research design 


• Methods that need to be implemented for processing and analysing the 
data required for research design 


The overall research design can be divided into the following four parts: 


• Sampling part: It includes the method of selecting items that are to 
be observed for the research study. 


• Observational part: It includes the conditions under which you need 
to make observations. 


• Statistical part: It is based on the number of items that need to be 
observed and the analysis technique to be used for the analysis of 
gathered data. 


• Operational part: It involves the techniques that help to implement the 
items specified in the sampling, statistical and observational designs. 


Need for research design 


Before starting the research process, the formulation of an efficient and 
appropriate research design is important. A research design is significant as 
it has the following advantages: 


• It helps in the smooth functioning of various research operations. 


• It requires less effort, time and money. 


• It helps to decide the methods and techniques to be used for collecting 
and analysing data. 


The researcher needs to consider the following factors before creating 
a research design: 


• Source of the information 
• Skills of the researcher and his coordinating staff 
• Problem objectives 
• Nature of the problem 
• Availability of time and money for the research work 


Features of a good research design 


A good research design is characterized by flexibility, efficiency and low 
cost, but it has many other features too. On the basis of the description of 
the design, a research design has the following features: 


• It states the sources and types of information required for solving the 
problem for which the research is being carried out. 


• It is a strategy for indicating the approach to be adopted for gathering 
and analysing data. 


• It includes performing research work according to time and budget 
constraints. 


• It minimizes bias and maximizes the reliability of collected 
and analysed data. 


• It minimizes experimental errors in an investigation. 


• It provides various aspects for dealing with a problem. 


A research design depends to a large extent on the type of research 
study that you are conducting. If the research study is exploratory, then major 
emphasis is on the discovery of ideas. So, the research design should be flexible 
enough to consider the different aspects of a phenomenon. However, when the 
purpose is to obtain an accurate description of a research study, the design 
that maximizes reliability of the collected data is considered a good design. 
The availability of time, money, skills of the research staff and the method 
of obtaining information must be considered while creating an experimental 
design, a survey design or a sample design. 


Steps in research design 


The steps in a research design primarily depend on the type of research being 
conducted. The steps involved in a research process are as follows: 


1. Preparing the research question or problem 
2. Assessing the available literature 
3. Creating hypotheses 
4. Constructing the research design 
5. Collecting data 
6. Analysing the data 
7. Interpreting the results 
8. Writing the research report 


The fourth step, i.e., constructing the research design, involves three 
subordinate steps, which include the process of creating a research design. 
The three subordinate steps can further be explained as follows: 


(i) Identifying variables: This involves identifying the variables to 
be studied and determining their types. The most common types of 
variables are dependent, independent, controlled and other variables. 
Dependent variables, also called criterion variables, are items such as 
the responses of subjects and the outcomes of a survey. Independent 
variables, on the other hand, are the explanatory or predictor variables. 


(ii) Formulating functional definitions: Here, the researcher explores the 
possibilities and the ways in which the variables can be operationalized. 


(iii) Selecting design for data analysis: This is the preliminary step of data 
collection, and hence, involves determination of what design option to 
choose for analysing the data being collected. 


Types of Research Design 


Several research designs are classified on the basis of the study performed 
in the research. These research designs can be listed as follows: 


• Research design in exploratory research studies 
• Research design in descriptive studies 
• Research design in quantitative studies 
• Research design in qualitative studies 
• Research design in experimental research studies 


1. Exploratory Research Design 


Exploratory research design is also known as formulative research design. 
In this research design, a specific subject is investigated. It helps to generate 
a set of hypotheses or research-based questions that can be used at a later 
stage. The three methods that are applied for exploratory research studies 
are as follows: 


• Surveying the literature: It is the simplest method for formulating 
the research problem, in which previous hypotheses are reviewed along 
with new literature and evaluated for future research. 


• Experience survey: It is a type of research that involves persons with 
practical experience in the research work. For such a survey, people 
with more innovative ideas are carefully selected as respondents and 
are then interviewed by the investigators. Thus, the experience 
survey enables the researcher to define the problem concisely. This 
survey also provides information about the practical possibilities for 
different research works. 


• Analysis of insight-stimulating examples: It includes an intensive 
study of selected instances of a phenomenon. This method relies on the 
attitude of the investigator, the intensity of the study and the ability 
of the researcher to unify the diverse information about the problem. 


Thus, in exploratory research study, the applied method needs to be 
flexible, regardless of the type of the method, so that the different aspects of 
the problem can be considered. In exploratory research design, the following 
considerations are kept in mind: 


• A small sample size is used. 
• Data requirements are unclear. 
• General objectives are considered, rather than specific objectives. 
• No definite suggestions are made after research analysis. 


2. Descriptive Research Design 


A descriptive research study describes the characteristics of a particular 
problem, an individual or a group. Descriptive studies are concerned with 
specific predictions and with the narration of facts and characteristics 
concerning an individual, a group or a situation. Most social research is 
based on descriptive research studies. In descriptive studies, the questions 
related to ‘what’, ‘why’, ‘where’ and ‘who’ need to be answered. 


The following steps must be followed while designing a descriptive study: 


1. Formulating the objectives of the study: This step specifies the 
objectives to ensure that the collected data is related to the study, 
otherwise the research will not provide the desired result. 


2. Designing the data collection methods: This step helps to select the 
method, that is, observation, questionnaires, interview or examination 
of records, for collecting the data. 


3. Processing and analysing the data: The data collected for the research 
study must be processed and analysed. This includes analysing the data 
collected through interviews and observations, tabulating the data and 
performing statistical computations. 


4. Reporting the researched data: For reporting the findings, the layout 
should be well planned, and presented in a simple and effective style. 


In descriptive studies, the following considerations should be kept in 
mind: 


• The phenomenon under study should be described. 


• The data may be related to the behavioural variables of the 
respondent. 


• The recommendations are definite. 


• The objectives should be specific, data requirements should be clear 
and large samples should be used. 


Descriptive research design requires a clear specification of ‘when’, 
‘where’, ‘who’, ‘what’, ‘why’, and ‘how’ of the research. Its main purpose 
is to describe the characteristics or the function. Some of the conditions in 
which this research can be recommended are: 


• To make a specific forecast 


• To discover associations among variables 


• To estimate the proportion of a population that has some specific 
characteristic 


• To describe the characteristics of a product, group, organization or 
market 


Unlike exploratory research, the descriptive research design is marked 
by specific hypotheses, a clear statement of the problem and detailed 
information needs. Generally, descriptive research uses surveys, panels, 
secondary data analysis and observation methods and can be classified into 
cross-sectional and longitudinal research. 


Cross-sectional research: This is the most frequently used research design 
in business research. It involves collecting information from a given sample 
of population elements only once. Cross-sectional designs may be either 
single or multiple. In single cross-sectional designs, 
only one sample of respondents is drawn from the target population, and 
the information from this sample is obtained only once. This design is also 
referred to as sample survey research design. 


In multiple cross-sectional design, there are two or more samples of 
respondents, and the information from each sample is obtained only 
once. Often, information from different samples is obtained at different times 
over long intervals. Multiple cross-sectional designs allow comparisons at the 
aggregate level but not at the individual respondent level. Because a different 
sample is taken each time a survey is conducted, there is no way to compare 
the measures on an individual respondent across surveys. A multiple 
cross-sectional design of special interest is cohort analysis, which consists of 
a series of surveys conducted at appropriate time intervals, where the cohort 
serves as the basic unit of analysis. A group of respondents who experience 
the same event within the same time interval is referred to as a ‘cohort’. 


Longitudinal research design: Unlike cross-sectional research design, a 
fixed sample(s) of population elements is measured repeatedly on the same 
variable. In other words, the same objects are studied over time and the 
same variables are measured. In contrast to the cross-sectional design, which 
provides a snapshot of the variables of interest at a single point in time, a 
longitudinal study gives a series of pictures that provide an in-depth view of 
the situation and the changes that have taken place over time. Sometimes, the 
term panel is used interchangeably with the term longitudinal design. A panel 
consists of a sample of respondents who have agreed to give information at 
specified intervals over an extended period. 
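

The difference between a multiple cross-sectional design and a longitudinal (panel) design can be sketched in a few lines of Python. The respondent identifiers, survey years and satisfaction scores below are hypothetical values invented for illustration only.

```python
from statistics import mean

# Multiple cross-sectional design: a different sample is drawn for each
# survey (hypothetical respondent IDs mapped to satisfaction scores).
survey_2022 = {"r01": 6, "r02": 7, "r03": 5}
survey_2023 = {"r11": 7, "r12": 8, "r13": 6}

# No respondent appears in both surveys, so only the aggregate level
# (here, the mean score) can be compared across time.
shared_respondents = set(survey_2022) & set(survey_2023)
aggregate_change = mean(survey_2023.values()) - mean(survey_2022.values())

# Longitudinal (panel) design: the same respondents are measured
# repeatedly on the same variable.
panel_wave_1 = {"p1": 6, "p2": 7, "p3": 5}
panel_wave_2 = {"p1": 8, "p2": 6, "p3": 7}

# Change can now be traced for each individual respondent.
individual_change = {
    rid: panel_wave_2[rid] - panel_wave_1[rid] for rid in panel_wave_1
}

print("Aggregate change between surveys:", aggregate_change)
print("Change per panel member:", individual_change)
```

In the sketch, the cross-sectional data support only the comparison of averages, whereas the panel data show that individual respondents can move in different directions even when the average rises.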


Causal research design: This research design is used to obtain the evidence 
of cause-and-effect (causal) relationships. Like descriptive research design, 
causal research design also requires a plan and structure and is more 
appropriate for the following purposes: 


• To understand cause (independent) variables and effect (dependent) 
variables of the phenomenon 


• To determine the nature of the relationship between cause and effect 
variables to make predictions about effect 


In this design, causal (independent) variables are manipulated in a 
relatively controlled environment, in which the other variables that may 
affect the dependent variable are controlled or checked as much as possible. 
The effect of this manipulation on one or more dependent variables is 
then measured to infer causality. The main method of causal research is 
experimentation. 


3. Diagnostic/Conclusive Research Design 


A conclusive research design is more structured and formal than an 
exploratory research design. It is based on large representative samples, 
and the data obtained is subjected to quantitative analysis. The aim of 
conclusive research is to examine specific relationships and test specific 
hypotheses. To achieve these objectives, the researcher needs to clearly 
specify the required information. In this research, the findings are considered 
as conclusive in nature as they are used as inputs for managerial 
decision-making. The two categories of conclusive research designs are 
descriptive and causal. Descriptive research designs can further be either 
cross-sectional or longitudinal. 


4. Experimental Research Design 


Experimental research design is usually applicable when we are determining 
the cause and effect relationship or deriving the cause and effect inferences in 
any experimental research study. Experimental research design is instrumental 
in answering some of the important psychological questions that are based 
on the concept of what causes what. 


The objective of experimental research design is to establish the cause 
and effect relationship between variables. The four types of variables related 
to experimental research design are as follows: 


• Independent variables: These signify conditions or measures in the 
experimental design that can be changed. 


• Dependent variables: These variables can be measured and signify 
the effect or result in the experimental design. 


• Control variables: These remain constant in the experimental design. 


• Random variables: These can vary their values in different conditions 
in the experimental design. 


There are many variations in experimental designs, which are created 
to achieve different results and resolve different problems. We can define 
the simplest form of experimental design by creating two similar groups, 
which are equivalent to each other in all respects, except for the fact that 
one group will receive the treatment and another group will not receive the 
treatment. The group that receives the treatment can be termed as the treatment 
group and the group that does not receive the treatment can be termed as the 
comparison or control group. 


The formation of two similar groups that are equivalent to each other is 
ensured by randomly assigning people or participants into two groups from a 
common pool of people or participants. The success of the experiment is based 
on the concept of random assignment of people into two groups. However, as 
two people cannot be exactly similar, in the experimental design, we refer to 
the idea of probability and say that the two groups are probabilistically 
equivalent, or equivalent within probabilistic ranges. 
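

Random assignment from a common pool can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name assign_groups, the participant labels and the fixed seed are assumptions introduced here.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split a common pool into a treatment and a control group."""
    rng = random.Random(seed)   # a seed makes the assignment repeatable
    pool = list(participants)   # copy, so the caller's list is not reordered
    rng.shuffle(pool)           # every ordering of the pool is equally likely
    half = len(pool) // 2
    return pool[:half], pool[half:]


# Hypothetical pool of 20 participants.
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = assign_groups(participants, seed=42)

print("Treatment group:", sorted(treatment))
print("Control group:  ", sorted(control))
```

Because chance alone decides who lands in which group, the two groups are expected to be similar on average, which is the sense in which they are probabilistically equivalent.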


5.4.1 Factors Affecting Research Design 


Some of the factors that affect research design are as follows: 


• Accessibility of scientific information 


• Availability of sufficient data, time, money and manpower 


• Exposure to various sources of data 


• Extent of the problem that needs to be resolved with the help of research 


• Support of the top management of the company or organization 


• Knowledge, skills and ability of the researcher 


5.4.2 Evaluation of Research Design 


The following points need to be considered when evaluating a research design: 


• Determining the nature of the research that needs to be evaluated 


• Checking the relevance and reliability of the sources that are cited in 
the research proposal 


• Checking whether the research design conforms to the standards of 
scientific research 


• Identifying whether the research design is semi-experimental, 
experimental, descriptive or correlational 


• Finding the ethical problems that may arise in the research design 


• Reviewing existing literature to find out if similar or same research 
has been done before 


Check Your Progress 


3. How does Bernard Phillips define research design? 
4. List the steps in the research process. 
5. What is the objective of experimental research design? 
6. List two factors that affect research design. 


ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. The process of developing a research proposal is ultimately one of 
establishing a gap in current research which the researcher aims to 
address. 


2. When two methods are compared in terms of their superiority, the 
assumption that both methods are equally good is called the null 
hypothesis. It is also known as the statistical hypothesis and is 
symbolized as H₀. 


3. Bernard Phillips defines research design as the blueprint for collection, 
measurement and analysis of data. 


4. The steps involved in a research process are as follows: 
(i) Preparing the research question or problem 
(ii) Assessing the available literature 
(iii) Creating hypotheses 
(iv) Constructing the research design 
(v) Collecting data 
(vi) Analysing the data 
(vii) Interpreting the results 
(viii) Writing the research report 


5. The objective of experimental research design is to establish the cause 
and effect relationship between variables. 


6. Two factors that affect research design are as follows: 
• Accessibility of scientific information 
• Availability of sufficient data, time, money and manpower 


5.6 SUMMARY 


• When a researcher is working on original research, he would like to 
identify a need for his research somewhere close to the beginning of 
his paper. 


• The researcher identifies the broad problem and states its importance. 
He also states what is significant in what has already been written. He 
describes the gap he proposes to fill in the existing research literature. 


• The term ‘hypothesis’ is derived from the ancient Greek term 
hypotithenai which means to put under or to suppose. 


• A hypothesis should be open to testing so that other deductions can be 
made from it and can be confirmed or disproved by observation. The 
researcher should do some prior study to make the hypothesis testable. 


• A hypothesis refers to a provisional idea whose merit needs evaluation, 
but has no specific meaning. 


• The rejection of a hypothesis leads us to conclude that it is false. This 
way of putting the problem is convenient because of the uncertainty 
inherent in the problem. 


• Research design is a structure that gives an outline of the overall 
research work. It is the result of better planning and implementation 
of a good strategy. 


• A good research design is characterized by flexibility, efficiency and 
low cost, but it has many other features too. 


• Several research designs are classified on the basis of the study 
performed in the research. These research designs can be listed as 
follows: 
o Research design in exploratory research studies 
o Research design in descriptive studies 
o Research design in quantitative studies 
o Research design in qualitative studies 
o Research design in experimental research studies 


5.7 KEY WORDS 


• Research Gap: It is a research question or problem which has not 
been answered appropriately or at all in a given field of study. 


• Random Selection: It refers to how sample members (study 
participants) are selected from the population for inclusion in the study. 


• Sampling Distribution: It is a probability distribution of a statistic 
obtained through a large number of samples drawn from a specific 
population. 


• Research Design: It is the set of methods and procedures used in 
collecting and analyzing measures of the variables specified in the 
research problem. 


5.8 SELF ASSESSMENT QUESTIONS AND 


EXERCISES 


Short-Answer Questions 


1. What are the characteristics of the research gap? 


2. Discuss the characteristics of a hypothesis. 


3. What are the different parts of the research design? 


Long-Answer Questions 


1. Explain null and alternative hypothesis. 


2. What is hypothesis testing? Discuss its procedure. 


3. Examine the various types of research design. 


6.2 Meaning of Sampling Design 
6.2.1 Principle of Sampling and Essentials of Good Sampling 
6.2.2 Sampling Concepts and Sampling Frame 

6.3 Census Method and Sampling Method for Investigation 

6.5 Summary 

6.6 Key Words 

6.7 Self Assessment Questions and Exercises 

6.8 Further Readings 


6.0 INTRODUCTION 


In the previous unit, you learnt about the economic management and 
hypothesis. In this unit, we will turn towards sampling design. 


While conducting research, collecting a sample is of utmost 
importance. However, simply collecting a sample is not enough. There must be 
a definite plan for obtaining a sample from the sampling frame. Proper planning 
and designing are essential for carrying out a survey. This unit 
focuses on the importance of sample design, its principles, the essentials of good 
sampling and the various methods involved in an investigation. Sampling design 
refers to the technique or procedure adopted by a researcher in selecting some 
samples or sampling units from which inferences about the population are drawn. 


6.1 OBJECTIVES 


After going through this unit, you will be able to: 
• Discuss the essentials of good sampling 
• Evaluate the need for sampling 
• Analyse the census method and sampling method 


6.2 MEANING OF SAMPLING DESIGN 


Sampling design refers to a definite plan for obtaining a sample from the 
sampling frame. It refers to the technique or procedure, which a researcher 
adopts in selecting some sampling units from where inferences about 
population are drawn. Sampling data is obtained before collecting the final 


Need for Sampling 


We can define sampling as the process of obtaining information about an 
entire population by examining only a part of it. Sampling is required for 
the following reasons: 


• It saves time and money. A sample study is usually less expensive than 
a census study. 


• It produces results at a faster speed. 


• It enables more accurate measurement for a sample study as it is 
conducted by experienced investigators. 


• It is the only method for an infinitely large population. 


• It usually enables you to estimate sampling errors and thus assists 
you in obtaining information concerning some characteristics of the 
population such as age group or gender. 


The advantages of sampling are as follows: 


• The only way to know the true or actual values of the various parameters 
of the population would be to take into account the entire population. 
This is not feasible due to the cost and time involved. Therefore, 
sampling is more economical. 


• As the magnitude of operation involved in a sample survey is small, 
the execution of the fieldwork and the analysis of results can be carried 
out at a faster rate and in less time. 


• Only a small staff is required for gathering and analysing information 
and preparing reports. Therefore, sampling is a relatively cheap process. 


• A researcher can collect detailed information in less time than is 
possible in a census survey. 


• As the scale of operation involved in a sample survey is small, the 
quality of interviews, supervision and other related activities is better 
than in a census survey. 


• Sampling provides adequate information needed for the purpose and 
is sufficiently reliable for surveys. 


Characteristics of Sampling 


Usually, sampling involves determining a property or attribute to adhere 
to for the purpose of differentiating between items of a given population. 
These attributes, which are the objects of study, are called characteristics. 
The process of distinguishing the items is usually of two types, quantitative 
or qualitative. In quantitative sampling, characteristics pertaining to variables 
are dealt with. On the other hand, qualitative sampling is concerned with the 
characteristics related to attributes. 


The basic idea behind sampling is to use the common characteristics 
of average items as samples for a larger entity. It therefore involves choosing a 
subset of population elements for study. For example, if the population 
to be dealt with is that of roads, then the characteristics could be length, 
duration, roughness, carriage capacity, etc. Sampling proves to be a much 
cheaper and quicker mode of estimation where the population is very 
large. 


However, it is necessary to take ample care while determining 
which characteristics should be sampled. Characteristics that are 
rare should be avoided. Similarly, very common characteristics that do 
not contribute in any way to drawing reliable estimates should not be 
sampled. 


6.2.1 Principle of Sampling and Essentials of Good Sampling 


After understanding various concepts related to sampling and sampling 
design, let us now look at the principles and essentials of sampling: 


• Unbiased: One of the primary principles of sampling is that it should 
not be biased. 


• Adequate sampling size: For accurate sampling, it is important that 
the size of sample is adequate. 


• Standardized samples: Samples should be standardized so that they 
can be checked for relevance and accuracy. 


• Statistical Regularity: According to this principle, the units of the 
sample must be selected at random. 


Uses of sampling in real life 


In our day-to-day life we make use of the concept of sampling. There is 
hardly any person who has not made use of the concept in a real-life situation. 
Consider the following examples: 


• Suppose you go to a grocery shop to purchase rice. You have been 
instructed by your mother to purchase good quality rice. On reaching 
the grocery shop you have the choice of buying the rice from any one 
of three bags. What is generally done is that you pick up a handful of 
rice from each bag, examine its quality and then decide which 
bag’s rice is to be bought. The concept of sampling is being used here, 
as the handful from each bag is a sample and examining its quality is 
the process by which you are trying to assess the quality of all the rice 
in the bag. 


• Suppose you have a guest for dinner at your residence. Your mother 
prepares a number of dishes and before the guest arrives, she may give 
you a tablespoon of each dish to taste and ask you to tell her whether all 
the ingredients are in the right proportion or not. Again, a sample is 
being taken from each dish to know how each of them tastes. 


• You go to a bookshop to buy a magazine. Before you decide to buy 
it, you may flip through its pages to know whether the contents of the 
magazine are of interest to you or not. Again, a sample of pages is 
taken from the magazine. 


6.2.2 Sampling Concepts and Sampling Frame 


Before we get into the details of various issues pertaining to sampling, it 
would be appropriate to discuss some of the sampling concepts. 


Population: Population refers to any group of people or objects that form 
the subject of study in a particular survey and are similar in one or more 
ways. For example, the number of full-time MBA students in a business 
school could form one population. If there are 200 such students, the 
population size would be 200. We may be interested in understanding their 
perceptions about business education. If there are 200 class IV employees in 
an organization and we are interested in measuring their job satisfaction, all 
the 200 class IV employees would form the population of interest. If a TV 
manufacturing company produces 150 TVs per week and we are interested 
in estimating the proportion of defective TVs produced per week, all the 
150 TVs would form our population. If, in an organization there are 1000 
engineers, out of which 350 are mechanical engineers and we are interested 
in examining the proportion of mechanical engineers who intend to leave 
the organization within six months, all the 350 mechanical engineers would 
form the population of interest. If the interest is in studying how the patients 
in a hospital are looked after, then all the patients of the hospital would fall 
under the category of population. 


Element: An element comprises a single member of the population. Out of 
the 350 mechanical engineers mentioned above, each mechanical engineer 
would form an element of the population. In the example of MBA students 
whose perception about the management education is of interest to us, each 
of the 200 MBA students will be an element of the population. This means 
that there will be 200 elements of the population. 


Sampling frame: Sampling frame comprises all the elements of a population 
with proper identification that is available to us for selection at any stage of 
sampling. For example, the list of registered voters in a constituency, the 
telephone directory, the list of students registered with a university, the 
attendance sheet of a particular class and the payroll of an organization 
are all examples of sampling frames. When the 
population size is very large, it becomes virtually impossible to form a 
sampling frame. We know that there is a large number of consumers of soft 
drinks and, therefore, it becomes very difficult to form the sampling frame 
for the same. 


Sample: It is a subset of the population. It comprises only some elements 
of the population. If out of the 350 mechanical engineers employed in 
an organization, 30 are surveyed regarding their intention to leave the 
organization in the next six months, these 30 members would constitute the 
sample. 


Sampling unit: A sampling unit is a single member of the sample. If a 
sample of 50 students is taken from a population of 200 MBA students in 
a business school, then each of the 50 students is a sampling unit. Another 
example could be that if a sample of 50 patients is taken from a hospital to 
understand their perception about the services of the hospital, each of the 50 
patients is a sampling unit. 


Sampling: It is a process of selecting an adequate number of elements 
from the population so that the study of the sample will not only help in 
understanding the characteristics of the population but will also enable us to 
generalize the results. We will see later that there are two types of sampling 
designs: probability sampling design and non-probability sampling design. 


Census (or complete enumeration): An examination of each and every 
element of the population is called census or complete enumeration. Census 
is an alternative to sampling. We will discuss the inherent advantages of 
sampling over a complete enumeration later. 


Check Your Progress 


1. What is ‘sample design’? 
2. State any one advantage of sampling. 


3. What is a sampling unit? 


6.3 CENSUS METHOD AND SAMPLING METHOD 
FOR INVESTIGATION 


In a research study, we are generally interested in studying the characteristics 
of a population. Suppose in a town there are 2 lakh households and we are 
interested in estimating the proportion of those households who spend their 
summer vacations in a hill station. This information can be obtained by asking 
every household in that town. If all the households in a population are asked to 
provide information, such a survey is called a census. There is an alternative 
way of obtaining the same information by choosing a subset of all the two 
lakh households and asking them for the same information. This subset is 
called a sample. Based upon the information obtained from the sample, a 
generalization about the population characteristic could be made. However, 
that sample has to be representative of the population. For a sample to be 
representative of the population, the distribution of sampling units in the 
sample has to be in the same proportion as the elements in the population. For 
example, if in a town there are 50, 35 and 15 per cent households in lower, 
middle and upper income groups, then a sample taken from this population 
should have the same proportions for it to be representative. There are 
several advantages of sample over census. 


• Sample saves time and cost. Consider as an example that we are 
interested in estimating the monthly average household expenditure 
on food items by the people of Delhi. It is known that the population 
of Delhi is approximately 1.2 crore. Now, if we assume that there 
are five members per household, it would mean that the population 
comprises approximately 24 lakh households. Collecting data on the 
expenditure of each of the 24 lakh households on food items would be 
a very time-consuming and expensive exercise. This is because you 
will need to hire a number of investigators and train them before you 
conduct the survey on the 24 lakh households. Instead, if a sample of, 
say, 2000 households is chosen, the task would not only be finished 
faster but would be inexpensive, too. 


• Many times a decision-maker may not have enough time to wait 
till all the information is available. In such cases, a sample could 
come to the rescue. 


• There are situations where a sample is the only option. To estimate 
the average life of fluorescent bulbs, the bulbs have to be burnt out 
completely. If we go for a complete enumeration, 
there would not be anything left for use. Another example could be 
testing the quality of a photographic film. To test the quality, we need 
to expose it completely and the moment it is exposed it gets destroyed. 
Therefore, a sample is the only choice. 


• The study of a sample instead of complete enumeration may, at times, 
produce more reliable results. This is because by studying a sample, 
fatigue is reduced and fewer errors occur while collecting the data, 
especially when a large number of elements are involved. 
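

The household count in the first point above can be checked with a short calculation. The sampling fraction in the second line is not stated in the text; it is derived here from the figures given, using 1 crore = $10^{7}$ and 1 lakh = $10^{5}$.

```latex
\[
\text{households} \approx \frac{1.2 \times 10^{7}\ \text{persons}}{5\ \text{persons per household}}
= 2.4 \times 10^{6} = 24\ \text{lakh}
\]
\[
\text{sampling fraction} = \frac{2000}{2.4 \times 10^{6}} \approx 0.00083 \approx 0.083\%
\]
```

In other words, the sample of 2000 households covers fewer than one household in a thousand, which is where the saving in time and cost comes from.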


A census is appropriate when the population size is small, e.g., the 
number of public sector banks in the country. Suppose the researcher is 
interested in collecting information from the top management of a bank 
regarding their views on the monetary policy announced by the Reserve 
Bank of India (RBI). In this case, a complete enumeration may be possible 
as the population size is not very large. As another example, consider a 
business school having a few students from Europe, East Africa, South East 
Asia and the Middle East. These students would have their own problems in 
settling down in the Indian environment because of the differences in social, 
cultural and environmental factors. To understand their concerns, a survey 
of population may be more appropriate. Therefore, a survey of population 
could be used when there is a lot of heterogeneity in the variables of interest 
and the population size is small. 


Methods of Sampling 


Sampling is important because the researcher usually cannot collect data 
from all cases. There are different types of sampling techniques/methods 
which a researcher needs to understand before selecting the proper sampling 
method for the research. Generally, a market research study uses one of two 
broad types of sampling, namely probability sampling and non-probability 
sampling, which will be discussed in detail in the next unit. 


Check Your Progress 


4. State any one advantage of sample over census. 


5. When is census appropriate? 


6.4 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. A definite plan for obtaining a sample from the sampling frame is 
known as sample design. It also encompasses the technique, adopted 
by a researcher in selecting some sampling units to carry out an 
investigation. 


2. One advantage of sampling is economy. To know the true or 
actual values of the various parameters of the population, one would have 
to take into account the entire population. This is often not feasible due 
to the cost and time involved. Therefore, sampling is more economical. 


3. A sampling unit is a single member of the sample. 


4. Sample saves time and cost. There are also situations where a sample is 
the only option. To estimate the average life of fluorescent bulbs, for 
example, the bulbs must be burnt out completely. In such cases, sampling 
scores over census. 


5. A census is appropriate when the population size is small, for example, 
when studying the public sector banks in the country. 


6.5 SUMMARY 


• A definite plan for obtaining a sample from the sampling frame is 
termed as sampling design. It refers to the technique or procedure 
which a researcher adopts in selecting the sampling units from which 
inferences about the population are drawn. 


• Sampling is required for obtaining information because it saves time 
and money. It is less expensive and produces results at a faster speed. 


• Sampling involves determining a property or attribute to adhere to for 
the purpose of differentiating between items of a given population. 
The basic idea behind sampling is to use the common characteristics 
of average items as samples for a larger entity. 


• Population refers to any group of people or objects that form the subject 
of study in a particular survey and are similar in one or more ways. 


• An element comprises a single member of the population while a 
sample is a subset of the population. It comprises only some elements 
of the population. 


• Sampling is a process of selecting an adequate number of elements 
from the population so that the study of the sample will not only help 
in understanding the characteristics of the population but will also 
enable us to generalize the results. 


• An examination of each and every element of the population 
is called census or complete enumeration. 


• A census is appropriate when the population size is small, for example, 
the number of public sector banks in the country. 


6.6 KEY WORDS 


• Quota Sampling: It is a sampling method in which the sample includes 
a minimum number from each specified subgroup in the population. 


• Sampling Design: It refers to a definite plan for obtaining a sample 
from the sampling frame. 


• Stratified Random Sampling: It is a method of sampling that involves 
the division of a population into smaller groups known as strata. In 
stratified random sampling, or stratification, the strata are formed based 
on members’ shared attributes or characteristics. 


• Sampling Frame: It comprises all the elements of a population with 
proper identification that is available to us for selection at any stage 
of sampling. 


6.7 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions 


1. Define sample design. Why do we need sample design? 
2. State any one characteristic of sampling. 
3. What is the census method? 


4. What do you understand by sampling frame? 


Long-Answer Questions 


1. Differentiate between census method and sampling method. 
2. Discuss the need and advantages of sampling in detail. 


3. Explain the advantages of sample over census. 


INTRODUCTION 


In the previous unit, you were introduced to the concept of sampling design. 
A sample design is made up of two elements, that is, a sampling method and 
an estimator. In this unit, the discussion will turn towards how to determine 
sample size, the factors affecting the size of a sample as well as sampling 
and non-sampling errors. 


7.1 OBJECTIVES 


After going through this unit, you will be able to: 


• Describe how to determine sample size 
• Examine sampling and non-sampling errors 
• Discuss biased samples 
• Discuss the different methods of sampling 


7.2 METHODS OF SAMPLING: PROBABILITY, 
NON-PROBABILITY AND MIXED SAMPLING 
DESIGN (OR SYSTEMATIC SAMPLING) 


Sampling design refers to the process of selecting samples from a population. 
There are two types of sampling designs: probability sampling design and 
non-probability sampling design. Probability sampling designs are used in 
conclusive research. In a probability sampling design, each and every element 
of the population has a known chance of being selected in the sample. The 
known chance does not mean equal chance. Simple random sampling is 
a special case of probability sampling design where every element of the 
population has both known and equal chance of being selected in the sample. 
In case of non-probability sampling design, the elements of the population do 
not have any known chance of being selected in the sample. These sampling 
designs are used in exploratory research. 


Probability Sampling Design 


Under this, the following sampling designs would be covered—simple 
random sampling with replacement (SRSWR), simple random sampling 
without replacement (SRSWOR), systematic sampling, stratified random 
sampling and cluster sampling. 


Simple random sampling with replacement 


Under this scheme, a list of all the elements of the population from which 
the sample is to be drawn is prepared. If there are 1000 elements in the 
population, we write the identification number or the name of each of the 
1000 elements on 1000 different slips. These are put in a box and shuffled 
properly. If there are 20 elements to be selected from the population, the 
simple random sampling procedure involves selecting a slip from the box and 
reading off the identification number. Once this is done, the chosen slip is 
put back in the box and again a slip is picked up and the identification 
number is read from that slip. This process continues till a sample of 20 is 
selected. Please note that the first element is chosen with a probability of 
1/1000, the second one is also selected with the same probability and so are 
all the subsequent elements. 


An alternative way of selecting the samples from the population is 
by using random number tables. Table 7.1 gives an illustrative example of 
random numbers. 


Table 7.1 Select Four-digit Random Numbers 


   I      II     III    IV     V 
  28??   ????   ????   7871   9559 
  8016   5732   3448   0164   2367 
  1322   4678   8034   1139   1474 
  0843   4625   7407   9987   5734 
  2364   1187   4565   2343   9786 
  4885   8755   4355   5465   0575 
  3406   4678   5950   7222   8494 
  5927   6010   7545   8979   1041 
  4447   3476   9140   0736   2332 
  4968   7553   1073   2493   4251 
  7489   1630   2330   4250   6170 
  4010   2707   3925   6007   8089 
  6531   9784   5520   7764   0008 
  7052   3861   7115   9521   2192 
  6573   2793   8710   2127   3846 
  8094   3205   2030   3035   5765 
  8615   6092   1900   4792   7684 
  9136   4016   3495   6549   9603 
  9656   5246   5090   8306   1522 
  2017   8323   1685   3006   3441 


Table 7.1 gives four-digit random numbers arranged in 20 rows and five 
columns. These random numbers can be generated by a computer programmed 
to scramble numbers. The logic for generating random numbers is that any 
number can be constructed from numbers 0 to 9. The probability that any 
one digit from 0 through 9 will appear is the same as that for any other digit 
and the appearance of the numbers is statistically independent. Further, the 
probability of one sequence of digits occurring is the same as that for any 
other sequence of the same length. 


The use of a random number table for selecting samples could be 
illustrated through an example. Suppose there are 75 students in a class 
and it is decided to select 15 out of the 75 students. These students can be 
numbered from 01 to 75. Now, to pick 15 students using random numbers 
and following the scheme of simple random sampling with replacement, we 
proceed as follows: 


• With eyes closed, we place our finger on a number on the random 
number table. Suppose it is on the first row and the first column of our 
table. Now, we go down the first column, reading the first two digits of 
each number, and choose two-digit random numbers running from 01 to 75. 
If any number greater than 75 appears, it gets rejected. This way, the 
first number to be selected would be 28. The second number is 80, which 
would be rejected as we are choosing numbers from 01 to 75. The next 
selected number would be 13, followed by 08, 23, 48, 34, 59, 44, 49, 74, 
40, 65, 70 and 65. Note that 65 has appeared twice. Since we are using 
the scheme of simple random sampling with replacement, we would retain 
it. This way we have selected 14 samples. The next four numbers (80, 86, 
91 and 96) are rejected, so the 15th number selected would be 20. In 
brief, the scheme explained above states that any number greater than 
the population size (in this case 75) is rejected and only the numbers 
from 01 to 75 are selected. A number may get repeated because the simple 
random sampling scheme is done with replacement. 
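

The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the readings are the first two digits of the Column I entries of Table 7.1 as quoted in the worked example.

```python
def select_with_replacement(two_digit_readings, population_size, sample_size):
    """Pick sample_size labels, rejecting readings outside 1..population_size.

    Repeated labels are kept, as in simple random sampling with replacement.
    """
    selected = []
    for reading in two_digit_readings:
        if 1 <= reading <= population_size:
            selected.append(reading)
        if len(selected) == sample_size:
            break
    return selected


# First two digits of each Column I entry, read from top to bottom.
column_one_readings = [28, 80, 13, 8, 23, 48, 34, 59, 44, 49,
                       74, 40, 65, 70, 65, 80, 86, 91, 96, 20]

sample = select_with_replacement(column_one_readings, population_size=75, sample_size=15)
print(sample)
```

The printed list reproduces the 15 student numbers of the example, with 65 appearing twice.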


Simple random sampling without replacement 


In the case of simple random sampling without replacement, the procedure is 
identical to what was explained in the case of simple random sampling with 
replacement. The only difference here is that the chosen slip is not placed back 
in the box. This way, the first unit would be selected with the probability of 
1/1000, the second unit with the probability of 1/999, the third with a 
probability of 1/998 and so on, till we select the required number of 
elements (in this case, 20) in our sample. 
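

The difference between the two schemes can be illustrated with Python's standard `random` module, used here in place of the box of slips. This is a minimal sketch: the seed value is an arbitrary assumption made only so that the run is repeatable, and the variable names are illustrative.

```python
import random

rng = random.Random(7)  # fixed seed (an arbitrary choice) so the run is repeatable
population = list(range(1, 1001))  # identification numbers 1..1000

# With replacement: each draw is made from all 1000 slips, so repeats are possible.
srswr = rng.choices(population, k=20)

# Without replacement: a drawn slip is not returned, so no label can repeat.
srswor = rng.sample(population, k=20)

# Probability attached to each successive draw without replacement.
draw_probabilities = [1 / (len(population) - i) for i in range(3)]
print(draw_probabilities)  # 1/1000, 1/999, 1/998 as decimals
```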


The simple random sampling (with or without replacement) is generally not 
used in consumer research. This is because in consumer research the population 
size is usually very large, which creates problems in the preparation of a 
sampling frame. For example, there is a large number of consumers of soft 
drinks, pizza, shampoo, soap, chocolate and so on. However, these (SRSWR 
and SRSWOR) designs could be useful when the population size is very small, 
for example, the number of steel/aluminium-producing companies in India 
and the number of banks in India. Since the population size is quite small, 
the preparation of a sampling frame does not create any problem. 


Another problem with these (SRSWR and SRSWOR) designs is that 
we may not get a representative sample using such a scheme. Consider an 
example of a locality having 10,000 households, out of which 5,000 belong to 
low-income group, 3,500 belong to middle income group and the remaining 
1,500 belong to high-income group. Suppose it is decided to take a sample 
of 100 households using the simple random sampling. The selected sample 
may not contain even a single household belonging to the high- and middle- 
income group and only the low-income households may get selected, thus, 
resulting in a non-representative sample. 
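

How likely the extreme case above is can be worked out directly. The sketch below, which assumes sampling without replacement and uses a hypothetical helper name, computes the chance that a simple random sample of 100 households misses a whole income group.

```python
def prob_no_units_from_group(population_size, group_size, sample_size):
    """Chance that a simple random sample drawn without replacement
    contains no unit from a group of group_size units."""
    probability = 1.0
    others = population_size - group_size
    for i in range(sample_size):
        # Each successive draw must again come from the units outside the group.
        probability *= (others - i) / (population_size - i)
    return probability


p_no_high = prob_no_units_from_group(10_000, 1_500, 100)
p_no_high_or_middle = prob_no_units_from_group(10_000, 5_000, 100)
print(p_no_high, p_no_high_or_middle)
```

Both probabilities are positive but very small for a sample of this size, so the complete absence of a group is an extreme case. The more common practical concern is that the group shares in a particular sample drift away from the 50, 35 and 15 per cent found in the population.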


Systematic sampling 


Systematic sampling takes care of the limitation of the simple random 
sampling that the sample may not be a representative one. In this design, 
the entire population is arranged in a particular order. The order could be 
by calendar date, or the elements of the population could be arranged in 
ascending or descending order of magnitude, which may be assumed to be 
random. A list of subjects arranged in alphabetical order could also be used; 
such a list is usually assumed to be random in order. Once this is done, the 
steps followed in the systematic sampling design are as follows: 


• First of all, a sampling interval given by K = N/n is calculated, where 
N = the size of the population and n = the size of the sample. The 
sampling interval K should be an integer. If it is not, it is 
rounded off to make it an integer. 


• A random number is selected from 1 to K. Let us call it C. 


• The first element to be selected from the ordered population would be 
C, the next element would be C + K and the subsequent one would be 
C + 2K and so on till a sample of size n is selected. 


This way we can get representation from all the classes in the population 
and overcome the limitations of the simple random sampling. To take an 
example, assume that there are 1,000 grocery shops in a small town. These 
shops could be arranged in an ascending order of their sales, with the first 
shop having the smallest sales and the last shop having the highest sales. If 
it is decided to take a sample of 50 shops, then our sampling interval K will 
be equal to 1000 ÷ 50 = 20. Now we select a random number from 1 to 20. 
Suppose the chosen number is 10. This means that the shop number 10 will 
be selected first and then shop number 10 + 20 = 30 and the next one would 
be 10 + 2 × 20 = 50 and so on till all the 50 shops are selected. This way 
we can get a representative sample in the sense that it will contain small, 
medium and large shops. 
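

The three steps and the grocery shop example can be sketched as follows. The function name and its parameters are illustrative assumptions; the positions returned are the 1-based ranks of the shops in the ordered list.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return 1-based positions C, C + K, C + 2K, ... in the ordered list."""
    interval = round(population_size / sample_size)  # K = N/n, rounded if needed
    if start is None:
        start = rng.randint(1, interval)  # random start C between 1 and K
    positions = [start + i * interval for i in range(sample_size)]
    # When N/n is not a whole number, rounding K can push positions past N,
    # so keep only positions that exist in the list.
    return [p for p in positions if p <= population_size]


shops = systematic_sample(population_size=1000, sample_size=50, start=10)
print(shops[:3], shops[-1], len(shops))  # [10, 30, 50] 990 50
```

Passing `start=10` reproduces the example in the text; leaving it out lets the function pick the random start itself.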


It may be noted that in a systematic sampling the first unit of the 
sample is selected at random (probability sampling design) and having chosen 
this, we have no control over the subsequent units of sample (non-probability 
sampling). Because of this, this design at times is called mixed sampling. 


The main advantage of systematic sampling design is its simplicity. 
When sampling from a list of population arranged in a particular order, one 
can easily choose a random start as described earlier. After having chosen 
a random start, every Kth item can be selected instead of going for a simple 
random selection. This design is statistically more efficient than a simple 
random sampling, provided the condition of ordering of the population is 
satisfied. 


The use of systematic sampling is quite common as it is easy and cheap 
to select a systematic sample. In systematic sampling one does not have to 
jump back and forth all over the sampling frame wherever the random numbers 
lead, and neither does one have to check for duplication of elements as 
compared to simple random sampling. Another advantage of a systematic 
sampling over simple random sampling is that one does not require a 
complete sampling frame to draw a systematic sample. The investigator may 
be instructed to interview every 10th customer entering a mall without a list 
of all customers. 


There may be situations where it may not be possible to get a 
representative sample. The design can create problems if the sampling 
interval is a whole number multiple of some cycle related to the problem. 
In such a case, there is a high probability of 
systematic bias creeping into the sample, resulting in a non-representative 
sample. Consider, for example, the case of a certain PVR cinema hall where 
there may be a couple of snack bars. We may be interested in estimating the 
average daily sales of a particular snack bar in that PVR. Now, using the daily 
data with the population and sample size known, we compute a sampling 
interval which may be a multiple of seven. Using this, we may select our 
first element which would reflect one of the seven days of the week, say 
Friday. The next element would also be Friday, as our sampling interval 
is a multiple of seven and so the subsequent elements of the population. 
Therefore, our sample would comprise only Fridays and the sample would 
not reflect day of the week variation in the sales data, which could result in 
a non-representative sample. Therefore, while using daily data, care should 
be taken that our sampling interval is not a multiple of seven. 
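

The effect of an interval that is a multiple of seven can be seen in a small sketch. Days are simply numbered 1, 2, 3 and so on, and the remainder after division by 7 stands in for the day of the week; the function name and the particular intervals are illustrative assumptions.

```python
def weekdays_covered(interval, start, sample_size):
    """Distinct days of the week hit when every interval-th day is sampled.

    Days are numbered 1, 2, 3, ...; day % 7 stands for the day of the week.
    """
    sampled_days = [start + i * interval for i in range(sample_size)]
    return sorted({day % 7 for day in sampled_days})


print(weekdays_covered(interval=14, start=5, sample_size=20))  # [5]
print(weekdays_covered(interval=10, start=5, sample_size=20))  # [0, 1, 2, 3, 4, 5, 6]
```

With an interval of 14 every sampled day falls on the same day of the week, whereas an interval of 10 cycles through all seven.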


Stratified random sampling 


Under this sampling design, the entire population (universe) is divided into 
strata (groups), which are mutually exclusive and collectively exhaustive. 
By mutually exclusive, it is meant that if an element belongs to one stratum, 
it cannot belong to any other stratum. Strata are collectively exhaustive if all 
the elements of various strata put together completely cover all the elements 
of the population. The elements are selected using a simple random sampling 
independently from each group. 


There are two reasons for using a stratified random sampling rather 
than simple random sampling. One is that the researchers are often interested 
in obtaining data about the component parts of a universe. For example, 
the researcher may be interested in knowing the average monthly sales of 
cell phones in ‘large’, ‘medium’ and ‘small’ stores. In such a case, separate 
sampling from within each stratum would be called for. The second reason 
for using a stratified random sampling is that it is more efficient as compared 
to a simple random sampling. This is because dividing the population into 
various strata increases the representativeness of the sample, as the elements 
within each stratum are homogeneous. 


There are certain issues that may be of interest while setting up a 
stratified random sample. These are: 


What criteria should be used for stratifying the universe (population)? 


The criteria for stratification should be related to the objectives of the study. 
The entire population should be stratified in such a way that the elements 
are homogeneous within the strata, whereas there should be heterogeneity 
between strata. As an example, if the interest is to estimate the expenditure of 
households on entertainment, the appropriate criterion for stratification would 
be the household income. This is because the expenditure on entertainment 
and household income are highly correlated. As another example, if the 
objective of the study is to estimate the amount of money spent on cosmetics, 
then gender could be used as an appropriate criterion for stratification. This 
is because it is known that though both men and women use cosmetics, the 
expenditure by women is much more than that of their male counterparts. 
Someone may argue that gender may no longer remain an appropriate 
criterion if it is not backed by income. Therefore, the researcher might have 
to use two or more criteria for stratification depending upon the problem 
in hand. This would, however, increase the number of strata, thereby making 
the sampling difficult. 


Generally, stratification is done on the basis of demographic variables 
like age, income, education and gender. Customers are usually stratified 
on the basis of life stages and income levels to study their buying patterns. 
Companies may be stratified according to size, industry and profits for 
analysing the stock market reactions. 


How many strata should be constructed? 


Going by common sense, as many strata as possible should be used so that 
the elements of each stratum will be as homogeneous as possible. However, 
it may not be practical to increase the number of strata and, therefore, the 
number may have to be limited. Too many strata may complicate the survey 
and make preparation and tabulation difficult. Costs of adding more strata 
may be more than the benefit obtained. Further, the researcher may end up 
facing the practical difficulty of preparing a separate sampling frame for 
each stratum, as simple random samples are to be drawn from each stratum. 


What should be the appropriate sample size to be taken from each 
stratum? 


This question pertains to the number of observations to be taken out from 
each stratum. At the outset, one needs to determine the total sample size for 
the universe and then allocate it between each stratum. This may be explained 
as follows: 


Let there be a population of size $N$. Let this population be divided into 
three strata based on a certain criterion. Let $N_1$, $N_2$ and $N_3$ denote 
the size of strata 1, 2 and 3 respectively, such that $N = N_1 + N_2 + N_3$. 
These strata are mutually exclusive and collectively exhaustive. Each of 
these three strata could be treated as a separate population. Now, if a total 
sample of size $n$ is to be taken from the population, the question arises as 
to how much of the sample should be taken from strata 1, 2 and 3 
respectively, so that the sample sizes from the strata add up to $n$. 


Let the size of the sample from the first, second and third strata be $n_1$, 
$n_2$ and $n_3$ respectively such that $n = n_1 + n_2 + n_3$. Then, there are 
two schemes that may be used to determine the values of $n_i$ ($i = 1, 2, 3$) 
for each stratum. These are proportionate and disproportionate allocation 
schemes. 


Proportionate allocation scheme: In this scheme, the size of the 
sample drawn from each stratum is proportional to the size of the 
stratum. As an example, if a bank wants to conduct a survey to understand the 
problems that its customers are facing, it may be appropriate to divide them 
into three strata based upon the size of their deposits with the bank. Suppose 
the bank has 10,000 customers, of whom 1,500 are big 
account holders (having deposits of more than ₹ 10 lakh), 3,500 are 
medium-sized account holders (having deposits of more than ₹ 2 lakh but 
less than ₹ 10 lakh) and the remaining 5,000 are small account holders (having 
deposits of less than ₹ 2 lakh). Suppose the total budget for sampling is fixed 
at ₹ 20,000 and the cost of sampling a unit (customer) is ₹ 20. If a sample 
of 100 is to be chosen from all the three strata, the size of the sample from 
stratum 1 would be: 


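$$n_1 = n \times \frac{N_1}{N} = 100 \times \frac{1500}{10000} = 15$$


The size of sample from stratum 2 would be: 


$$n_2 = n \times \frac{N_2}{N} = 100 \times \frac{3500}{10000} = 35$$


The size of sample from stratum 3 would be: 


$$n_3 = n \times \frac{N_3}{N} = 100 \times \frac{5000}{10000} = 50$$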


This way the size of the sample chosen from each stratum is proportional 
to the size of the stratum. Once we have determined the sample size from each 
stratum, one may use the simple random sampling or the systematic sampling 
or any other sampling design to take out samples from each of the strata. 
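

The proportionate allocation for the bank example can be reproduced with a short sketch. The function name and the stratum labels are illustrative assumptions; the stratum sizes and the total sample of 100 come from the text.

```python
def proportionate_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum size."""
    population_size = sum(stratum_sizes.values())
    return {name: round(total_sample * size / population_size)
            for name, size in stratum_sizes.items()}


bank_strata = {"big": 1500, "medium": 3500, "small": 5000}
allocation = proportionate_allocation(bank_strata, total_sample=100)
print(allocation)  # {'big': 15, 'medium': 35, 'small': 50}
```

Here the shares come out as whole numbers. With other stratum sizes the rounded figures may not add up exactly to the total, and a small manual adjustment would then be needed.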


Disproportionate allocation: As per the proportionate allocation explained 
above, the sizes of the samples from strata 1, 2 and 3 are 15, 35 and 50 
respectively. As it is known that the cost of sampling a unit is ₹ 20 
irrespective of the stratum from which the sample is drawn, the bank would 
naturally be more interested in drawing a large sample from stratum 1, which 
has the big customers, as it gets most of its business from stratum 1. In other 
words, the bank may follow a disproportionate allocation of sample as the 
importance of each stratum is not the same from the point of view of the 
bank. The bank may like to take a sample of 45 from stratum 1 and 40 and 15 
from strata 2 and 3 respectively. Also, a larger sample may be desired from 
the strata having more variability. 
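

One way to let variability drive the allocation is to make each stratum's share proportional to its size multiplied by its standard deviation, a rule usually known as Neyman allocation. The text does not name this rule, and the standard deviations below are purely hypothetical figures chosen for illustration, so the sketch should be read as one possible approach rather than the bank's method.

```python
def allocate_by_variability(stratum_sizes, stratum_std_devs, total_sample):
    """Allocate in proportion to (stratum size x stratum standard deviation)."""
    weights = {name: stratum_sizes[name] * stratum_std_devs[name]
               for name in stratum_sizes}
    total_weight = sum(weights.values())
    return {name: round(total_sample * w / total_weight)
            for name, w in weights.items()}


bank_strata = {"big": 1500, "medium": 3500, "small": 5000}
# Hypothetical standard deviations of the survey variable within each stratum.
assumed_std_devs = {"big": 8.0, "medium": 3.0, "small": 1.0}

allocation = allocate_by_variability(bank_strata, assumed_std_devs, total_sample=100)
print(allocation)  # {'big': 44, 'medium': 38, 'small': 18}
```

Under these assumed figures the big account holders receive a much larger share than the 15 units given by proportionate allocation, which is the direction of adjustment the text describes.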


Cluster sampling 


In cluster sampling, the entire population is divided into various clusters in 
such a way that the elements within the clusters are heterogeneous. However, 
there is homogeneity between the clusters. This design, therefore, is just the 
opposite of the stratified sampling design, where there was homogeneity 
within the strata and heterogeneity between the strata. To illustrate 
cluster sampling, one may assume that there is a company 
having its corporate office in a multi-storey building. On the first floor, we 
may assume that there is a marketing department where the offices of the 
president (marketing), vice president (marketing) and so on to the level of 
management trainee (marketing) are there. Naturally, there would be a lot 
of variation (heterogeneity) in the amount of salaries they draw and hence 
a high amount of variation in the amount of money spent on entertainment. 
Similarly, if the finance department is housed on the second floor, we may find 
almost a similar pattern. The same could be assumed for the third, fourth and 
other floors. Now, if each of the floors could be treated as a cluster, we find that 
there is homogeneity between the clusters but there is a lot of heterogeneity 
within the clusters. Now, a sample of, say, 2 to 3 clusters is chosen at random 
and once having done so, each of the chosen clusters is enumerated completely to 
be able to make an estimate of the amount of money the entire population 
spends on entertainment. 
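

The floor-wise example can be sketched as below. All the spending figures, the third and fourth floor departments and the seed are hypothetical values introduced only for illustration; the point is that whole clusters are chosen at random and then every member of a chosen cluster is surveyed.

```python
import random

# Hypothetical monthly entertainment spending (in rupees) of employees, by floor.
floors = {
    "floor_1_marketing": [12000, 7000, 3000, 1500],
    "floor_2_finance": [11000, 6500, 3500, 1200],
    "floor_3_operations": [12500, 7200, 2800, 1600],
    "floor_4_hr": [11800, 6900, 3100, 1400],
}

rng = random.Random(3)  # arbitrary seed so the sketch is repeatable
chosen_floors = rng.sample(sorted(floors), k=2)  # pick whole clusters at random

# Every employee on a chosen floor is enumerated (a census within each cluster).
observations = [amount for floor in chosen_floors for amount in floors[floor]]
estimated_mean = sum(observations) / len(observations)
print(chosen_floors, round(estimated_mean, 2))
```

Because each hypothetical floor contains the full range from senior to junior staff, the mean from any two floors is close to the mean for the whole building, which is the condition the design depends on.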


Examples of cluster sampling could include ad-hoc organizational 
committees drawn from various departments to advise the CEO of a company 
on product development, new product ideas, evaluating alternative advertising 
programmes, budget allocations and marketing strategies. Each of the clusters 
comprises a heterogeneous collection of members with different interests, 
background, experience, value system and philosophy. The CEO of the 
company may be able to take strategic decisions based upon their combined 
advice. 


Although the per unit costs of cluster sampling are much lower than 
those of other probability sampling designs, the applicability of cluster 
sampling to an organizational context may be questioned as a cluster may not 
contain heterogeneous elements. The condition of heterogeneity within the 
cluster and homogeneity between the clusters may not be met. As another 
example, the households in a block tend to be similar rather than dissimilar 
and, as a result, it may be difficult to form heterogeneous clusters. 


Cluster sampling is useful when populations under a survey are widely 
dispersed and drawing a simple random sample may be impractical. 


Non-probability Sampling Designs 


Under the non-probability sampling, the following designs would be 
considered—convenience sampling, purposive sampling, snowball sampling 
and quota sampling. 


Convenience sampling 


Convenience sampling is used to obtain information quickly and 
inexpensively. The only criterion for selecting sampling units in this scheme is 
the convenience of the researcher or the investigator. Mostly, the convenience 
samples used are neighbours, friends, family members, colleagues and 
‘passers-by’. This sampling design is often used in the pre-test phase of 
a research study such as the pre-testing of a questionnaire. Some of the 
examples of convenience sampling are: 


• People interviewed in a shopping centre for their political opinion for
a TV programme.

• Monitoring the price level in a grocery shop with the objective of
inferring the trends in inflation in the economy.

• Requesting people to volunteer to test products.

• Using students or employees of an organization for conducting an
experiment.

• Interviews conducted by a TV channel of people coming out of a
cinema hall, to seek their opinion about the movie.

• A researcher visiting a few shops near his residence to observe which
brand of a particular product people are buying, so as to draw a rough
estimate of the market share of the brand.


In all the above situations, the sampling unit may either be self-selected
or selected because of ease of availability. No effort is made to choose a
representative sample. Therefore, in this design the difference between the
population value (parameter) of interest and the sample value (statistic) is
unknown in terms of both magnitude and direction. It is therefore not
possible to estimate the sampling error, and researchers will not be
able to make a conclusive statement about the results from such a sample. For
this reason, convenience sampling should not be used in conclusive
research (descriptive and causal research).


Convenience sampling is commonly used in exploratory research. This
is because the purpose of exploratory research is to gain an insight into
the problem and generate a set of hypotheses which could be tested with the
help of conclusive research. When very little is known about a subject, a
small-scale convenience sample can be of use in the exploratory work to
help understand the range of variability of responses in a subject area.


Judgemental sampling 


Under judgemental sampling, experts in a particular field choose what
they believe to be the best sample for the study in question. Judgemental
sampling calls for special efforts to locate and gain access to the individuals
who have the required information. Here, the judgement of an expert is used
to identify a representative sample. For example, the shoppers at a shopping
centre may serve to represent the residents of a city, or some of the cities
may be selected to represent a country. Judgemental sampling design is used
when the required information is possessed by a limited number/category of
people. This approach may not empirically produce satisfactory results and
may, therefore, curtail the generalizability of the findings, because the
sample consists of experts (respondents) who are usually conveniently
available to us. Further, there is no objective way to evaluate the precision of
the results. A company wanting to launch a new product may use judgemental
sampling for selecting ‘experts’ who have prior knowledge or experience
of similar products. A focus group of such experts may be conducted to get
valuable insights. In an organizational context, knowledgeable opinion
leaders are included in the sample. Enlightened opinions (views and
knowledge) constitute a rich data source.


The most common application of judgemental sampling is in business- 
to-business (B to B) marketing. Here, a very small sample of lead users, key 
accounts or technologically sophisticated firms or individuals is regularly 
used to test new product concepts, producing programmes, etc. 


Snowball sampling 


Snowball sampling is generally used when it is difficult to identify the
members of the desired population, e.g., deep-sea divers, families with triplets,
people using walking sticks, doctors specializing in a particular ailment,
etc. Under this design each respondent, after being interviewed, is asked to
identify one or more others in the field. This could result in a very useful
sample. The main problem is in making the initial contact. Once this is done,
these cases identify more members of the population, who then identify further
members and so on. The next problem is to identify new cases. It may also be
difficult to get a representative sample. One plausible reason for this could
be that the initial respondents may identify other potential respondents who
are similar to themselves.


Quota sampling 


In quota sampling, the sample includes a minimum number from each 
specified subgroup in the population. The sample is selected on the basis 
of certain demographic characteristics such as age, gender, occupation, 
education, income, etc. The investigator is asked to choose a sample that 
conforms to these parameters. Field workers are assigned quotas of the sample 
to be selected satisfying these characteristics. 


A researcher wants to measure the job satisfaction level among the
employees of a large organization and believes that the job satisfaction level
varies across different types of employees. The organization has 10 per
cent, 15 per cent, 35 per cent and 40 per cent of its employees in class I,
class II, class III and class IV respectively. If a sample of 200 employees is
to be selected from the organization, then 20, 30, 70 and 80 employees from
class I, class II, class III and class IV respectively should be selected from
the population. Now, various investigators may be assigned quotas from each
class in such a way that a sample of 200 employees is selected from various
classes in the same proportion as mentioned in the population. For example,
the first field worker may be assigned a quota of 10 employees from class I,
15 from class II, 20 from class III and 30 from class IV. Similarly, a second
investigator may be assigned a different quota such that a total sample of 200
is selected in the same proportion as the population is distributed. Please
note that the investigators may choose the employees from each class as
conveniently available to them. Therefore, the sample may not be totally
representative of the population, and hence the findings of the research
cannot be generalized. However, the reason for choosing this sampling design
is the convenience it offers in terms of effort, cost and time.
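
The proportional quotas in this example can be checked with a short calculation. The following is a minimal sketch, not a prescribed procedure: the function name `allocate_quotas` is hypothetical, and the class shares of 10, 15, 35 and 40 per cent are the ones implied by quotas of 20, 30, 70 and 80 in a sample of 200. Where the proportional shares do not come out as whole numbers, the sketch assumes a largest-remainder rule, which is one of several reasonable choices.

```python
def allocate_quotas(sample_size, population_shares):
    """Split sample_size across subgroups in proportion to their shares.

    population_shares maps each subgroup to its share (e.g. a percentage).
    Fractional quotas are rounded with a largest-remainder rule (an
    assumption made for this sketch) so that the quotas add up exactly.
    """
    total_share = sum(population_shares.values())
    raw = {g: sample_size * s / total_share for g, s in population_shares.items()}
    quotas = {g: int(r) for g, r in raw.items()}  # whole-number part first
    leftover = sample_size - sum(quotas.values())
    # Hand the remaining units to the groups with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda g: raw[g] - quotas[g], reverse=True)
    for g in by_remainder[:leftover]:
        quotas[g] += 1
    return quotas


shares = {"Class I": 10, "Class II": 15, "Class III": 35, "Class IV": 40}
print(allocate_quotas(200, shares))
```

For the shares above, the function returns 20, 30, 70 and 80 for the four classes, matching the quotas worked out in the text.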


In the example given above, it may be argued that job satisfaction is 
also influenced by education level, categorized as higher secondary or below, 
graduation, and postgraduation and above. By incorporating this variable, the 
distribution of population may look as given in Table 7.2. From the table, we 
may note that there are 8 per cent class I employees who are postgraduate 
and above, there are 35 per cent class IV employees with a higher secondary 
education and below and so on. Now, suppose a sample of size 200 is 
again proposed. In this case, the distribution of sample satisfying these two 
conditions in the same proportion in the population is given in Table 7.3. 


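Table 7.2 Distribution of Population (percentage)


| Education | Class I | Class II | Class III | Class IV | Total |
| Postgraduation and above | 8 | 5 | 5 | 0 | 18 |
| Graduation | 2 | 10 | 20 | 5 | 37 |
| Higher Secondary and below | 0 | 0 | 10 | 35 | 45 |
| Total | 10 | 15 | 35 | 40 | 100 |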


Table 7.3 Distribution of Sample (numbers)


| Education | Class I | Class II | Class III | Class IV | Total |
| Postgraduation and above | 16 | 10 | 10 | 0 | 36 |
| Graduation | 4 | 20 | 40 | 10 | 74 |
| Higher Secondary and below | 0 | 0 | 20 | 70 | 90 |
| Total | 20 | 30 | 70 | 80 | 200 |


Table 7.3 indicates that a sample of 20 class II employees who are
graduates should be selected. Likewise, a sample of 10 employees who
possess postgraduate and above education should be selected. In the above
table, the sample to be taken from each of the 12 cells has been specified.
Having done so, each of the investigators is assigned a quota to collect
information from the employees conforming to the above norms so that a
sample of 200 is selected.


Quota sampling design may look similar to the stratified random
sampling design. However, there are differences between the two. In the
stratified sampling design, the selection of sample from each stratum is
random but in the quota sampling, the respondents may be chosen at the
convenience or judgement of the researchers. Further, as already stated, the
results of stratified random sampling could be generalized, whereas it may
not be possible in the case of quota sampling. Quota sampling has some
advantages over the probabilistic techniques. This design is very economical
and it does not take too much time to set it up. Also, the use of this design
does not require a sampling frame.
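
The difference between the two designs can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the population of 1,000 employees, the function names, and the idea that a field worker meets employees in a fixed order. Both functions fill the same quotas; only the stratified version selects at random within each class.

```python
import random


def stratified_random_sample(units, quotas, rng):
    """Draw each stratum's quota at random from all members of that stratum."""
    sample = []
    for stratum, k in quotas.items():
        members = [u for u in units if u[1] == stratum]
        sample.extend(rng.sample(members, k))  # random selection within stratum
    return sample


def quota_sample(units_in_order_met, quotas):
    """Accept units in the order they are met until each quota is filled."""
    remaining = dict(quotas)
    sample = []
    for unit in units_in_order_met:
        stratum = unit[1]
        if remaining.get(stratum, 0) > 0:
            sample.append(unit)
            remaining[stratum] -= 1
    return sample


# Hypothetical population of 1,000 employees stored as (id, class) pairs.
classes = ["I"] * 100 + ["II"] * 150 + ["III"] * 350 + ["IV"] * 400
units = list(enumerate(classes))
quotas = {"I": 20, "II": 30, "III": 70, "IV": 80}

stratified = stratified_random_sample(units, quotas, random.Random(42))
by_quota = quota_sample(units, quotas)
print(len(stratified), len(by_quota))
```

Both samples contain 200 employees in the required class proportions, but the quota sample simply consists of the first employees encountered in each class, which is why its results cannot be generalized with the same confidence.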


However, quota sampling also has certain weaknesses:

• The total number of cells depends upon the number of control
characteristics associated with the objectives of the study. If the
control characteristics are numerous, the total number of cells increases,
which may make the task of the investigator difficult.

• The chosen control characteristics should be related to the objectives
of the study. The findings of the study could be misleading if any
relevant parameter is omitted for one reason or the other.

• The investigator may visit those places where the chances of getting
the respondents with the required control characteristics are high.
The investigator could also avoid some respondents who appear to be
unfriendly. All this could make the findings of the study
less reliable.


7.3 SAMPLE SIZE DETERMINATION, 
CALCULATION AND FACTORS AFFECTING 
THE SIZE OF THE SAMPLE 


The size of a sample depends upon the basic characteristics of the population,
the type of information required from the survey and the cost involved.
Therefore, a sample may vary in size for several reasons. In general, the size
of the population does not influence the size of the sample; it matters only
when the sample is a sizeable fraction of a finite population.


There are various methods of determining the sample size in practice:

• Researchers may arbitrarily decide the size of sample without giving
any explicit consideration to the accuracy of the sample results or
the cost of sampling. This arbitrary approach should be avoided.

• For some projects, the total budget for the field survey is fixed
in advance. If the cost of sampling
per sample unit is known, one can easily obtain the sample size
by dividing the total budget allocation by the cost of sampling per
unit. This method concentrates only on the cost aspect of sampling,
rather than the value of information obtained from such a sample.

• There are other researchers who decide on the sample size based on
what was done by the other researchers in similar studies. Again, this
approach cannot be a substitute for the formal scientific approach.

• The most commonly used approach for determining the size of
sample is the confidence interval approach covered under inferential
statistics. This approach is discussed below for determining
the size of a sample for estimating population mean and population
proportion. In a confidence interval approach, the following points
are taken into account for determining the sample size in estimation
problems involving means:


(a) The variability of the population: The higher
the variability as measured by the population standard deviation,
the larger will be the size of the sample. If the standard deviation of
the population is unknown, a researcher may use the estimates of
the standard deviation from previous studies. Alternatively, the
estimates of the population standard deviation can be computed
from the sample data.

(b) The confidence attached to the estimate: It is a matter of judgement
how much confidence you want to attach to your estimate.
Assuming a normal distribution, the higher the confidence the
researcher wants for the estimate, the larger will be the sample size. This
is because the value of the standard normal variate ‘Z’ will vary
accordingly. For a 90 per cent confidence, the value of ‘Z’ would
be 1.645 and for a 95 per cent confidence, the corresponding ‘Z’
value would be 1.96 and so on (see Appendix 1 at the end of the
book). A higher confidence leads to a larger ‘Z’ value and hence
to a larger sample.

(c) The allowable error or margin of error: How accurate we
want our estimate to be is again a matter of judgement of the
researcher. It will of course depend upon the objectives of the
study and the consequences of greater inaccuracy.
If the researcher seeks greater precision, the resulting sample size
would be large.
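
The ‘Z’ values quoted in point (b) can also be computed rather than read from a table. The sketch below uses the `NormalDist` class from Python's standard `statistics` module; the helper name `z_value` is an assumption made for this illustration, and a two-sided confidence interval is assumed.

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal critical value for a confidence level.

    A confidence of 0.95 leaves 0.025 in each tail, so the value returned
    is the point below which 97.5 per cent of the distribution lies.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf((1 + confidence) / 2)


for level in (0.90, 0.95, 0.96, 0.99):
    print(level, round(z_value(level), 3))
```

Computed values may differ from printed tables in the third decimal place; for example, tables often give the 99 per cent value as 2.575, while the computed value rounds to 2.576.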


Sample size for estimating population mean 


We have learnt in the central limit theorem that the sampling distribution of
the sample mean follows a normal distribution with a mean $\mu$ and a standard
error $\sigma_{\bar{X}}$, irrespective of the shape of the population
distribution, whenever the sample size is large. Symbolically, it may be
written as:

$$\bar{X} \sim N(\mu, \sigma_{\bar{X}}), \qquad n \geq 30$$


The above also holds true whenever samples are drawn from a normal
population. However, in that case, the requirement of a large sample is not
there. The various notations are explained as under:

$\bar{X}$ = Sample mean

$\mu$ = Population mean

$\sigma_{\bar{X}}$ = Standard error of mean

$n$ = Sample size

$N$ = Population size

$\sigma$ = Population standard deviation

The value of:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \quad \text{(when samples are drawn from an infinite population)}$$

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \quad \text{(when samples are drawn from a finite population)}$$

The expression $\sqrt{\frac{N-n}{N-1}}$ is called the finite population
multiplier and need not be used while sampling from a finite population
provided $\frac{n}{N} < 0.05$.
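
The effect of the finite population multiplier can be seen numerically. The following is a minimal sketch under the definitions given above; the function name `standard_error` and the figures used in the example are assumptions chosen only for illustration.

```python
import math


def standard_error(sigma, n, population_size=None):
    """Standard error of the sample mean.

    With population_size=None the population is treated as infinite and the
    result is sigma / sqrt(n). Otherwise the finite population multiplier
    sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Illustrative figures: sigma = 30, n = 100, and N = 10,000 so that n/N = 0.01.
print(standard_error(30, 100))
print(standard_error(30, 100, population_size=10_000))
```

When the sampling fraction is below 0.05, as in this illustration, the two values differ by less than one per cent, which is why the multiplier is usually ignored in that case.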


The standard normal variate Z may be written as:

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{e\sqrt{n}}{\sigma}$$

where $\bar{X} - \mu = e$ = Margin of error. Solving for $n$ gives:

$$n = \frac{Z^2 \sigma^2}{e^2}$$

It may be noted from above that the size of the sample increases with the
variability in the population and with the value of Z for a
confidence interval: it is directly proportional to $\sigma^2$ and to $Z^2$.
It varies inversely with the square of the error. It may also
be noted that the size of a sample does not depend upon the size of population.
Below are given some worked out examples for the determination of a sample
size.
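
Before turning to the worked examples, the formula $n = Z^2\sigma^2/e^2$ can be sketched as a small function. This is only an illustration: the name `sample_size_for_mean` is hypothetical, and rounding the result up to the next whole unit is an assumption consistent with the "approx." values used in the examples.

```python
import math


def sample_size_for_mean(z, sigma, e):
    """Sample size needed to estimate a population mean.

    z     : standard normal value for the chosen confidence level
    sigma : population standard deviation
    e     : allowable (margin of) error, in the same units as sigma
    """
    n = (z * sigma / e) ** 2
    # Round away floating-point noise, then round up to a whole unit.
    return math.ceil(round(n, 6))


print(sample_size_for_mean(1.645, 30, 7))    # data of Example 7.1
print(sample_size_for_mean(2.575, 8.6, 0.5)) # data of Example 7.2
print(sample_size_for_mean(2.055, 320, 45))  # data of Example 7.3
```

Halving the allowable error in any of these calls roughly quadruples the required sample size, since the error enters the formula as a square.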

Example 7.1: An economist is interested in estimating the average monthly
household expenditure on food items by the households of a town. Based on
past data, it is estimated that the standard deviation of the population on the
monthly expenditure on food items is ₹30. With allowable error set at ₹7,
estimate the sample size required at a 90 per cent confidence.


Solution:


90 per cent confidence ⇒ Z = 1.645

e = ₹7

σ = ₹30

$$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (30)^2}{(7)^2} = 49.7025 \approx 50$$


Example 7.2: You are given a population with a standard deviation of 8.6.
Determine the sample size needed to estimate the mean of the population
within ±0.5 with a 99 per cent confidence.


Solution:

99 per cent confidence ⇒ Z = 2.575

e = ±0.5

σ = 8.6

$$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(2.575)^2 (8.6)^2}{(0.5)^2} = 1961.60 \approx 1962$$


Example 7.3: It is desired to estimate the mean life time of a certain kind of
vacuum cleaner. Given that the population standard deviation σ = 320 days,
how large a sample is needed to be able to assert with a confidence level of
96 per cent that the mean of the sample will differ from the population mean
by less than 45 days?


Solution:


96 per cent confidence ⇒ Z = 2.055

e = 45

σ = 320

$$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(2.055)^2 (320)^2}{(45)^2} = 213.55 \approx 214$$


Determination of sample size for estimating the population proportion


If the sample proportion $\bar{p}$ is used to estimate the population
proportion $p$, the standard error of $\bar{p}$ would be $\sqrt{pq/n}$,
where $q = 1 - p$. Now assuming normal distribution, we have

$$\bar{p} \sim N\left(p, \sqrt{\frac{pq}{n}}\right)$$

Therefore,

$$Z = \frac{\bar{p} - p}{\sqrt{pq/n}}$$

Writing the margin of error as $e = \bar{p} - p$,

$$Z = \frac{e}{\sqrt{pq/n}} = \frac{e\sqrt{n}}{\sqrt{pq}}$$

$$n = \frac{Z^2 pq}{e^2}$$

The above formula will be used if the value of population proportion
p is known. If, however, p is unknown, we substitute the maximum value of
pq in the above formula. It can be shown that the maximum value of pq is
1/4 when p = 1/2 and q = 1/2. This is shown in Figure 7.1.

Therefore,

$$n = \frac{1}{4} \cdot \frac{Z^2}{e^2}$$
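
The two cases above (p known, and p unknown so that pq is set to its maximum of 1/4) can be sketched in one function. The name `sample_size_for_proportion` is hypothetical, and rounding up to the next whole unit is an assumption consistent with the "approx." values in the examples that follow.

```python
import math


def sample_size_for_proportion(z, e, p=None):
    """Sample size needed to estimate a population proportion.

    z : standard normal value for the chosen confidence level
    e : allowable (margin of) error, as a proportion
    p : prior estimate of the proportion; if None, p = 0.5 is used,
        which maximises p * (1 - p) at 1/4
    """
    if p is None:
        p = 0.5
    n = z ** 2 * p * (1 - p) / e ** 2
    # Round away floating-point noise, then round up to a whole unit.
    return math.ceil(round(n, 6))


print(sample_size_for_proportion(1.96, 0.035))        # no prior estimate of p
print(sample_size_for_proportion(1.96, 0.04, p=0.3))  # prior estimate p = 0.3
```

Because p(1 - p) is largest at p = 0.5, leaving `p` unset always gives the most conservative (largest) sample size for a given confidence level and margin of error.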


Let us consider a few examples for determining a sample size while 
estimating the population proportion. 


Fig. 7.1 Graph of pq Corresponding to the Values of p


Example 7.4: A market researcher for a consumer electronics company would
like to study the television viewing habits of the residents of a particular small
city. What sample size is needed if he wishes to be 95 per cent confident of
being within ±0.035 of the true proportion who watch the evening news on
at least three weeknights if no previous estimate is available?


Solution:


95 per cent confidence ⇒ Z = 1.96

e = ±0.035

$$n = \frac{1}{4} \cdot \frac{Z^2}{e^2} = \frac{1}{4} \cdot \frac{(1.96)^2}{(0.035)^2} = 784$$


Example 7.5: A manager of a department store would like to study women’s
spending per year on cosmetics. He is interested in knowing the population
proportion of women who purchase their cosmetics primarily from his store.
If he wants to have a 90 per cent confidence of estimating the true proportion
to be within ±0.045, what sample size is needed?


Solution:


90 per cent confidence ⇒ Z = 1.645

e = ±0.045

$$n = \frac{1}{4} \cdot \frac{Z^2}{e^2} = \frac{1}{4} \cdot \frac{(1.645)^2}{(0.045)^2} = 334.0772 \approx 335$$

Example 7.6: A consumer electronics company wants to determine the job
satisfaction levels of its employees. For this, they ask a simple question,
‘Are you satisfied with your job?’ It was estimated that no more than 30 per
cent of the employees would answer yes. What should be the sample size for
this company to estimate the population proportion to ensure a 95 per cent
confidence in result, and to be within 0.04 of the true population proportion?


Solution:


95 per cent confidence ⇒ Z = 1.96

e = 0.04

p = 0.3

q = 0.7

$$n = \frac{Z^2 pq}{e^2} = \frac{(1.96)^2 \times 0.3 \times 0.7}{(0.04)^2} = 504.21 \approx 505$$


Factors to be noted for sample size determination


There are certain issues to be kept in mind before applying the formulas for
the determination of sample size in this unit. First of all, these formulas are
applicable for simple random sampling only. Further, they relate to the sample
size needed for the estimation of a particular characteristic of interest. In a
survey, a researcher needs to estimate several characteristics of interest and
each one of them may require a different sample size. In case the universe is
divided into different strata, the accuracy required for determining the sample
size for each stratum may be different, and the present method will not
be able to serve this requirement. Lastly, the formulas for sample size must be
based upon adequate information about the universe.


7.4 SAMPLING AND NON-SAMPLING ERRORS 


There are two types of error that may occur while we are trying to estimate 
the population parameters from the sample. These are called sampling and 
non-sampling errors. 


I. Sampling Error 


This error arises when a sample is not representative of the population. For
example, suppose our population comprises 200 MBA students in a business school
and we want to estimate the average height of these 200 students by taking a
sample of 10 (say). Let us assume for the sake of simplicity that the true value
of population mean (parameter) is known. When we estimate the average
height of the sampled students, we may find that the sample mean is far away
from the population mean. The difference between the sample mean and the
population mean is called sampling error, and this could arise because the
sample of 10 students may not be representative of the entire population.
Suppose now we increase the sample size from 10 to 15; we may find that
the sampling error reduces. If we keep doing so, we may note that
the sampling error reduces with the increase in sample size, as an increased
sample may result in increasing the representativeness of the sample.
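
The behaviour described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not data from any real study: the 200 heights are randomly generated (mean 170 cm, standard deviation 8 cm), the function name is hypothetical, and the sampling error is averaged over repeated simple random samples drawn without replacement.

```python
import random
import statistics


def mean_abs_sampling_error(population, n, trials, rng):
    """Average absolute gap between sample mean and population mean."""
    mu = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)  # simple random sample, no replacement
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)


rng = random.Random(7)
# Hypothetical heights (cm) of the 200 students in the population.
heights = [rng.gauss(170, 8) for _ in range(200)]

for n in (10, 15, 50, 200):
    print(n, round(mean_abs_sampling_error(heights, n, 1000, rng), 3))
```

The average error shrinks as the sample grows, and it vanishes when the sample size equals the population size of 200, because the sample then contains every student.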


Reducing the sampling errors 


The following methods are useful in reducing sampling errors: 


(i) Increase in the size of the sample: As already mentioned, the 
sampling error can be reduced by increasing the sample size. If the 
sample size equals the population size, then the sampling error is zero. 


(ii) Stratification: A simple random sample is likely to be representative 
of the population if the population contains homogeneous units. 
However, when the population consists of dissimilar units, a simple 
random sample may not be representative of all types of units in the 
population. In order to improve the result of the sample, the sample 
design is modified. The population is divided into different groups 
comprising similar units. These groups are referred to as strata. From 
each group (stratum), a sub-sample is selected in a random manner. 
As a result, all the groups have representation in the sample, reducing 
the sampling error. It is known as stratified-random sampling. 


II. Non-Sampling Error 


This error arises not because a sample is not representative of the population
but because of other reasons. Some of these reasons are listed below:

• The respondents, when asked for information on a particular variable,
may not give the correct answers. If a person aged 48 is asked a question
about his age, he may indicate the age to be 36, which may result in
an error in estimating the true value of the variable of interest.

• The error can arise while transferring the data from the questionnaire
to the spreadsheet on the computer.

• There can be errors at the time of coding, tabulation and computation.

• If the population of the study is not properly defined, it could lead to
errors.

• The chosen respondent may not be available to answer the questions
or may refuse to be part of the study.

• There may be a sampling frame error. Suppose the population comprises
households in the low-income, middle-income and high-income categories.
The researcher might decide to ignore the low-income category
respondents and may take the sample only from the middle and the
high-income category people.


Reducing non-sampling errors 


Non-sampling errors are a fraction of the total error arising from performing
a statistical analysis. The balance of the total error arises from sampling
error. Unlike sampling error, increase in the sample size does not reduce
non-sampling error. Non-sampling errors can, however, be reduced in
the following ways:

• Improving survey processes: Non-sampling errors can be reduced by
improving survey processes. The survey should rely on the collection
and analysis of data relating to editing performance as well as to the
sources, types and distribution of errors in the data.


• Systematic and orderly specification of edits: To mitigate
non-sampling errors, edits need to be specified in a systematic and
orderly manner, and data should be amended
only in response to important errors. A balance should essentially be
achieved between:

  o Edits applied in the field and those applied in the office

  o Automated and clerical approaches to verification and amendment
  of errors

  o Use of micro and macro-editing methods

• Use of computer-assisted telephone interviewing (CATI) and
computer-assisted personal interview (CAPI) systems: These
systems allow for greater control than the normal paper questionnaire
does. There is less leeway for interviewers to commit mistakes. Further,
the integration of data collection, data entry and editing brings down the
chances of errors in these systems. These systems have the technology
to provide immediate feedback on possible errors. This allows the
interviewer to query the response during the interview itself.

• Use of computer-assisted coding systems and automated coding
systems: These coding systems have the potential for more accurate and
less expensive coding. Further, they result in more consistent
coding.


7.4.1 Biased Sample 


The purpose of research is to estimate an unbiased statistic that represents 
the true parameter for a population. Bias is defined as a predisposition to one 
particular outcome over another. Bias in research leads to unrepresentative 
outcomes. In other words, bias causes researchers’ estimates to be predisposed 
to the left or to the right of the true mark. 


Statistical estimation assumes that errors are random, whereas bias introduces
systematic error into the research design or analysis, rendering outcomes
unreliable or meaningless. Random error is unpredictable, while systematic
error is predictable. A biased estimate can be the result of one or
more mistakes made before, during, or after a study.


A sample that is not representative of the population from which it was
drawn is called a biased sample.


Check Your Progress

1. What does a size of a sample depend upon? 

2. Define sampling error. 

3. What does bias in research lead to? 

4. What is probability sampling design? 

5. Define systematic sampling. 

6. What is the importance of cluster sampling?

7.5 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 

1. The size of a sample depends upon the basic characteristics of the 
population, the type of information required from the survey and the 
cost involved. 

2. The difference between the sample mean and the population mean is 
called sampling error. 

3. Bias in research leads to unrepresentative outcomes. In other words, 
bias causes researchers’ estimates to be predisposed to the left or to 
the right of the true mark. 

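4. In a probability sampling design, each element of the population has a
known chance of being selected in the sample. The designs covered under it
include simple random sampling, systematic sampling, stratified random
sampling and cluster sampling.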

5. Systematic sampling takes care of the limitation of the simple random 
sampling that the sample may not be a representative one. 

6. When populations under a survey are widely dispersed and drawing
a simple random sample is impractical, cluster sampling comes to the
rescue.

7.6 SUMMARY 


• The size of a sample depends upon the basic characteristics of the
population, the type of information required from the survey and the
cost involved. Therefore, a sample may vary in size for several reasons.
Researchers may arbitrarily decide the size of sample without giving
any explicit consideration to the accuracy of the sample results or the
cost of sampling.

• In a survey, a researcher needs to estimate several characteristics of
interest and each one of them may require a different sample size.

• There are two types of error that may occur while we are trying to
estimate the population parameters from the sample. These are called
sampling and non-sampling errors.

• A sampling error arises when a sample is not representative of the
population.

• A non-sampling error arises not because a sample is not representative
of the population but because of other reasons.

• Non-sampling errors are a fraction of the total error arising from
performing a statistical analysis. The balance of the total error arises
from sampling error.

• Unlike sampling error, increase in the sample size does not reduce
non-sampling error.

• The purpose of research is to estimate an unbiased statistic that
represents the true parameter for a population.

• A sample that is not representative of the population from which it was
drawn is called a biased sample.

• The process of selecting samples from a population is termed as
sampling design. Two types of sampling designs are probability
sampling design and non-probability sampling design.

• In the proportionate allocation scheme, the size of the sample in each
stratum is proportional to the size of the population of the stratum.

• In cluster sampling, the entire population is divided into various clusters
in such a way that the elements within the clusters are heterogeneous.

• Convenience sampling is used to obtain information quickly and
inexpensively. The only criterion for selecting sampling units in this
scheme is the convenience of the researcher or the investigator.

• Snowball sampling is generally used when it is difficult to identify the
members of the desired population, e.g., deep-sea divers, families with
triplets, people using walking sticks, doctors specializing in a particular
ailment, etc.

• In quota sampling, the sample includes a minimum number from each
specified subgroup in the population. The sample is selected on the basis
of certain demographic characteristics such as age, gender, occupation,
education, income, etc.

7.7 KEY WORDS 


• Field Survey: It is a type of field research by which archaeologists
search for archaeological sites and collect information about the
location, distribution and organization of past human cultures across
a large area.

not due to sampling.

• Bias: It is disproportionate weight in favour of or against one thing,
person, or group compared with another, usually in a way considered
to be unfair.


7.8 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions

1. What are the points that need to be taken into consideration while
determining sample size?
2. How do we reduce non-sampling errors?
3. Write a short note on biased sample.

Long-Answer Questions

1. Describe the methods for determining sample size.
2. What is sampling error? How can researchers reduce sampling errors?
3. Illustrate how to determine sample size for estimating population mean.
4. What are the methods of sampling? Discuss any two in detail.
5. Explain cluster sampling.
6. Analyse judgemental sampling.


8.0 INTRODUCTION


In this unit, you will be introduced to the process of data collection. Primary 
data can be obtained through observations or through direct communication 
with the persons associated with the selected subject by performing surveys 
or descriptive research. A telephonic interview is also usually limited to two 
persons. However, it is conducted over the telephone. Telephonic interviews 
are generally considered as the initial methods for screening the candidates for 
personal interviews. Observation methods can be categorized into different 
types depending on various factors such as style for recording the observed 
information, data needed for observation and activity of the observer. The 
information or the questions included in the schedule should be accurate and 
should enable the respondent to better understand the context in which the 
questions are asked. 


8.1 OBJECTIVES 


After going through this unit, you will be able to:
• Discuss the different sources of data
• Describe the types and methods of observation
• Explain the types and limitations of interview


8.2 SOURCES OF DATA: PRIMARY DATA AND
SECONDARY DATA


There are various methods of data collection which help the user to gather
and compile information from various locations.


To understand the multitude of choices available to a researcher for 
collecting the project/study-specific information, one needs to be fully 
cognizant of the resources available for the study and the level of accuracy 
required. To appreciate the truth of this statement, one needs to examine the 
gamut of methods available to the researcher. The data sources could be either 
contextual and primary or historical and secondary in nature. 


Primary data, as the name suggests, is original, problem- or project-specific,
and collected for the specific objectives and needs spelt out by the
researcher. Its authenticity and relevance are reasonably high. The monetary
and resource implications of collecting it are also quite high, and sometimes
a researcher might not have the resources or the time or both to go ahead with
this method. In this case, the researcher can look at alternative sources of data
which are economical and authentic enough to take the study forward. These
include the second category of data sources, namely secondary data.


Secondary data, as the name implies, is information that is not topical or
research-specific and has been collected and compiled by some other
researcher or investigative body. This information is recorded and
published in a structured format and is thus quicker to access and manage.
In most instances, unless it is a data product, it is also not too expensive
to collect. As suggested in the opening vignette, the data needed to track
consumer preferences is readily available as a data product or as audit
information which the researcher or the organization can procure.


8.3 MODES OF DATA COLLECTION 


There are several methods of collecting primary data, which are as follows:
(a) Interview method
(b) Observation method
(c) Survey method
(d) Questionnaire method
(e) Schedule method
(f) Scaling technique
(g) Other methods, such as warranty cards, distributor audits, pantry audits,
consumer panels, mechanical devices, projective techniques, depth
interviews and content analysis.


8.3.1 Interview: Types, Conduct, Preparation, Effective Techniques
and Limitation


The interview is a method of collecting data that involves the presentation of
oral-verbal stimuli and replies in the form of oral-verbal responses.
Interviews can be either personal or telephonic.


Personal Interviews 


A personal interview typically involves two people: the interviewer and the
interviewee. The interviewer is the person who questions the interviewee in a
face-to-face discussion. There can, however, be more than one interviewer
in a personal interview. There are two types of interviews: the direct
personal interview and the indirect oral interview.


In a direct personal interview, the interviewer collects information
from the concerned sources and should be present at the site from where
the data has to be collected. This method is most appropriate for intensive
investigations, but it may not be suitable in situations where direct
contact with the concerned person is not possible. In such cases, an
indirect oral examination or investigation takes place, in which the
interviewer cross-examines the interviewee to check his knowledge about the
problem under investigation. The information exchanged between the interviewee
and the interviewer is recorded for future reference.


Personal interviews can be of the following types:


• Structured Interviews: If the personal interview takes place in a
structured way, it is called a structured interview. In this type of
personal interview, the set of questions to be asked is predefined
and the techniques used to record the information are highly
standardized. Structured interviews are economical, as they place
few demands on the interviewer, and they are used as a main technique
for collecting information in descriptive research studies.


• Unstructured Interviews: If the personal interview takes place
in an unstructured way, the questions to be asked of the interviewee
are decided at the time of the interview. The set of questions is
not predetermined and no standardised recording techniques are used.
A list of additional questions is provided to the interviewer, and it
is up to him whether to ask them. This method demands deep knowledge
and greater skill on the part of the interviewer. You can use
unstructured interviews in exploratory and formulative research studies.


Telephonic Interview


A telephonic interview is also usually limited to two persons; however, it is
conducted over the telephone. Telephonic interviews are generally used
as an initial method for screening candidates for personal interviews.
They test various skills of the interviewee, including verbal
reasoning and oral communication. Some important tips for the
interviewee in a telephonic interview are:
• Must keep the resume in front of him/her.
• Should keep the employer research materials within easy reach.
• Should keep a notepad to note down the important points of the
interview.
• Should talk in a calm and cool manner.
• Should sound professional when answering the questions.


Limitations of interview method


• This method is time consuming.
• It is difficult to generalize the findings because of the small sample size
used in this method.
• The interviewer may be biased. This may lead him to ask closed-ended
questions, which affects the validity and reliability of the interview.


Given below is an interview guide created for a beverage purchase
and consumption study.


Interview guide: beverage purchase and consumption
Introduction and Warm Up
Hi, I am conducting a short survey on soft drink consumption, and I would like to take
some insights from you on your purchase. There are no right or wrong answers; however,
since you consume soft drinks, your opinion is really important for understanding
purchase behaviour.

1. Tell me something about yourself... what do you do, as in occupation... your
hobbies... your interests? How would you describe yourself as a person? Do you
generally plan and buy...

2. PROBE FURTHER: PSYCHOGRAPHICS/LIFESTYLE

3. PURCHASE BEHAVIOUR:

4. This soft drink that you have purchased... how do you generally consume it...
chilled/cool, can/bottle, stand alone or mixed with something?

5. If I were to ask you to list occasions for soft drinks’ purchase, they would be:

• brand
• price
• deals
• taste
• packaging
• any other

PROBE ALL ATTRIBUTES FOR REASONS. For example, what kind of deals?
Packaging? Brand image?

7. Supposing your favourite brand is not available for purchase... what do you
do... (PROBE)... do you move on to another store or pick up another
brand... (PROBE)... reason(s)

9. EXPOSE PICTURE
I am going to show you some display pictures. Please tell me which one you
think looks attractive... (let the respondent select)... (PROBE reasons for

10. EXPOSE PICTURE
I am going to show you a picture of a store. Where would you generally expect
the soft drinks to be placed... in your opinion, is this the right place or can it be
put somewhere else... REASON

11. Buy one get one free, a freebie, coupons, prizes. Do you get moved to try out and
buy some of these?... which ones did you try... REACTION

12. Soft drinks companies come up with a lot of ads... can you tell me something
about some ads? What do you recall... (note: degree of recall and if brand
recalled was the right match)... did it influence your purchase of the drink? PROBE

Thank you.


8.3.2 Observation: Types and Techniques 


The observation method is the most common method used in the behavioural
sciences. Observation in itself is not a scientific method, but it becomes a
scientific tool when it serves a formulated research purpose. In this method,
the information collected by the researcher is based entirely on his own
observation. For example, a researcher studying different brands of shoes will
not question the person wearing shoes of a particular brand; rather, he will
observe for himself and draw his conclusions. The main advantage of this method
is that there is little scope for bias if the observation is done accurately.
Secondly, the data collected through observation relates to what is currently
happening; it is not affected by past behaviour or future intentions. Thirdly,
this method is independent of a person’s willingness to respond and does not
require much cooperation on the part of the person, as is the case with the
interview or questionnaire method. Observation is therefore suitable for
subjects who are not capable of expressing their feelings verbally.


Types of observation method


Observation methods can be categorized into different types depending on
various factors such as the style of recording the observed information, the
data needed for observation and the activity of the observer. Following are the
different types of observation methods:


• Structured observation: It is an observation method in which the
following points need to be considered:
o Careful definition of the matter that needs to be observed.
o Identification of the style that must be used to record the observed
information.
o Standardization of the conditions of observation.
o Selection of the data required for observation.
This method is most appropriate where a descriptive study of the matter
under observation is required.


• Unstructured observation: It is an observation method in which the
definition of the matter to be observed, the style of recording, the
standardized conditions of observation and the selection of the required
data are not settled in advance. This method is most appropriate where
an exploratory study of the matter under observation is required.


• Participant observation: It is an observation method in which the
observer is also a member of the group he is observing, in order to
understand the needs and the problems faced by the group in a better
way. For example, a team leader who observes all his team members
while doing the same work as they do is a participant observer. There
are several advantages of participant observation:
o The researcher is able to record the natural behaviour of the group.
o The researcher can gather information which could not easily be
obtained by observing from an isolated position.
o The researcher can verify the truth of statements made by
informants in the context of a questionnaire or a schedule.


• Non-participant observation: It is an observation method in which
the observer is not a member of the group under observation. This
method has the disadvantage that the observer is unable to sense what
the members of the group feel.


• Disguised observation: It is an observation method in which the
members of the group are unaware of the fact that they are being
observed.


• Controlled observation: Observation that takes place according
to definite pre-arranged plans, involving experimental procedures, is
called controlled observation.


• Uncontrolled observation: Observation that takes place in the
natural setting is called uncontrolled observation. The main aim
of this observation is to get a spontaneous picture of the situation, and
the prime requirement for this is sufficient time.


Limitations of observation method 


Though observation methods provide different ways of studying
behavioural science, they have some limitations:


• Observation methods are generally expensive.
• They provide very limited information about the observed matter.
• They may be affected by unwanted factors. For example, people
who are not directly involved in the observation might interfere
with the collection of data.


Points to be considered while doing observation


In the observation methods, researchers must keep in mind the following
points at the time of observing any information:
• What should be observed?
• How should the observation be recorded?
• How can the accuracy of the observation be ensured?


An example of an observation sheet is given below:


Observation Sheet: Organic Retailer
Name of Store:          Location:          Size of Store:
Store personnel (number):
Store personnel (attitude):
Store atmosphere:
Approximate footfalls
Weekdays:          Weekends:
Percentage of conversions
Weekdays:          Weekends:

Please mark (•) the items in stock

Product                  Stock | Product                    Stock
Tea                            | CEREALS
Organic Tea                    | Amaranth
Flavoured                      | Amaranth Popped
Snacks                         | Amaranth Breakfast Cereal
Cookies (Ragi/Ramdana)         | Jhangara
Bread                          | Ragi
Namkins                        | Ragi Atta
Spices                         | Maize
Chilli Powder                  | Maize Atta
Chilli Red                     | Wheat Atta
Dhania Powder                  | Wheat Dalia
Dhania Seeds                   | Wheat Puffed
Haldi Whole                    | PULSES
Haldi Powder                   | Arhar Dal
Mustard Powder                 | Bhatt Dal
Sesame/Til                     | Kulath Dal
Zeera                          | Masoor Dal
PRESERVES                      | Moong Sabut
Mango Pickle                   | Moong Dal
Garlic Pickle                  | Kabuli Channa
Mixed Pickle                   | Naurangi Dal
Amla Chutney                   | Rajma (Brown/White)
Ginger Ale                     | Rajma (Chitkabra)
Burans Squash                  | Rajma (Mix)
Lemon Squash                   | Rajma (Red Small)
Malta Squash                   | Urad Dal
Pudina Squash                  | Urad Whole
                               | RICE
ANY OTHER                      | Basmati Dehradun
                               | Rice Khanda
                               | Rice Rikhwa
                               | Rice Unpolished
                               | Rice Hansraj
                               | Rice Red
                               | Rice Kasturi
                               | Rice Kelas
                               | Rice Punjab Basmati
                               | Rice Ramjavan
                               | Rice Sela


Check Your Progress 
1. What is primary data? 


2. What is non-participant observation? 


3. What is uncontrolled observation? 


8.4 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. Primary data, as the name suggests, is original, problem- or project-specific,
and collected for the specific objectives and needs spelt out by the
researcher.


2. It is an observation method in which the observer is not a member of 
the group under observation. 


3. The observation that takes place in the natural setting is called the 
uncontrolled observation. 


8.5 SUMMARY 


• There are various methods of data collection which help the user to
gather and compile information from various locations.


• Primary data, as the name suggests, is original, problem- or project-specific,
and collected for the specific objectives and needs spelt out by the
researcher.


• There are several methods of collecting primary data, which are as
follows:
(a) Interview method
(b) Observation method
(c) Survey method
(d) Questionnaire method
(e) Schedule method
(f) Scaling technique
(g) Other methods, such as warranty cards, distributor audits, pantry audits,
consumer panels, mechanical devices, projective techniques, depth
interviews and content analysis.


• The interview is a method of collecting data that involves the presentation
of oral-verbal stimuli and replies in the form of oral-verbal
responses. Interviews can be either personal or telephonic.


• A personal interview involves two people: the interviewer and the
interviewee. The interviewer is the person who questions the
interviewee.


• The observation method is the most common method used in the
behavioural sciences. Observation in itself is not a scientific method, but it
becomes a scientific tool when it serves a formulated research purpose.


• Observation methods can be categorized into different types depending
on various factors such as the style of recording the observed information,
the data needed for observation and the activity of the observer.


8.6 KEY WORDS 


• Secondary Data: It refers to data collected by someone other than
the user.


• Interview: It refers to a meeting of people face to face, especially for
consultation.


• Observation: It means the action or process of closely observing or
monitoring something or someone.


8.7 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions


1. What are the various methods of data collection?
2. What is an unstructured interview?


Long-Answer Questions


1. What is a personal interview? Discuss the various types of personal
interview.
2. Examine the types and techniques of the observation method.


9.0 INTRODUCTION


In the previous unit, you were introduced to the process of data collection. In
this unit, the discussion on the collection of data will continue. As you learnt,
there are essentially two sources of data, that is, primary and secondary data.
Two of the important methods of collecting primary data are questionnaires and
schedules. We will discuss these in detail in this unit.


9.1 OBJECTIVES 


After going through this unit, you will be able to:
• Discuss the meaning and types of questionnaire
• Explain the types and characteristics of a schedule
• Differentiate between questionnaire and schedule


9.2 SCHEDULE: MEANING, KINDS, ESSENTIALS,
PROCEDURE FOR THE FORMULATION OF A
SCHEDULE


A schedule is a questionnaire containing a set of questions that are required to
be answered to collect the data about a particular item.


The following are the main objectives of a schedule:


• A schedule is created for a definite item of enquiry. The schedule sets
the boundaries for the subject under study.


• A schedule acts as a memory aid for the interviewer. Since the
interviewer collects information from various respondents, he might
otherwise get confused while analysing and tabulating the data.


• A schedule helps in tabulating and analysing the data in a systematic
and standardised manner.


Types of schedules 


There are five types of schedules, which are as follows: 


1. Observation schedule: It is the schedule under which the observer 
observes all the activities and records all the responses of the 
respondents under some predefined conditions. The chief idea behind 
examining the activities is to verify the required information. 


2. Rating schedule: It is the schedule used to measure and rate the 
thoughts, preferences, self-consciousness, perceptions and other similar 
characteristics of the respondents. 


3. Document schedule: It is the schedule used for collecting important
data and preparing a source list. This schedule is mostly used to obtain
written facts and case histories from autobiographies, diaries or
government records.


4. Institution survey schedule: It is the schedule used for studying 
different problems of institutions. 


5. Interview schedule: It is the schedule under which an interviewer asks 
the questions to the interviewee and records his response in the given 
space of the questionnaire. 


Merits of the schedule method 


Following are the merits of the schedule method: 


• In this method, the researcher is always there to help the respondents.
So, the response rate is high as compared to other methods of data
collection.


• The presence of the researcher not only removes doubts from
the minds of the respondents but also discourages false replies, since
respondents fear cross-checking.


• ... the respondent. Thus, the data can be collected easily and can be relied
upon.


• This method helps to better understand the personality, living conditions
and values of the respondents.


• It is easy for the researcher to detect and rectify the defects in the
schedule during sampling.


Limitations of schedule method 


Following are the limitations of the schedule method:
• It is a costly and time-consuming method.
• This method requires well-trained and experienced field workers to
interview the respondents.
• Sometimes, the respondent may not be able to disclose certain facts
because of the personal presence of the researcher.
• If the field of research is dispersed, it becomes difficult to organise the
various activities of the research.


Characteristics of a good schedule 


The essential characteristics of a good schedule are as follows: 


• The information or the questions included in the schedule should be
accurate and should enable the respondent to better understand the
context in which the questions are asked.


• The schedule should be pre-arranged and structured in such a manner
that the information gathered or collected is accurate and tenable. For
this, the following points must be considered:
o The size of the schedule should be appropriate.
o The questions in the schedule should be understandable and
definite.
o The questions should not contain any biased evaluation.
o All the questions of the schedule should be properly interlinked.
o Information gathered should be organised in a table so that it can
be easily used for statistical analysis.
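

The last point, organising responses in a table for statistical analysis, can be illustrated with a minimal sketch. It assumes that each completed schedule is stored as one record of answers; the field names (respondent, brand, pack) and the sample answers are hypothetical and are used only to show how tabulated responses lend themselves to simple counts.

```python
from collections import Counter

# Hypothetical completed schedules: one dictionary per respondent.
schedules = [
    {"respondent": 1, "brand": "A", "pack": "can"},
    {"respondent": 2, "brand": "B", "pack": "bottle"},
    {"respondent": 3, "brand": "A", "pack": "bottle"},
]


def tabulate(records, question):
    """Count how many respondents gave each answer to one question."""
    return Counter(record[question] for record in records)


# Frequency table for the (hypothetical) "brand" question.
brand_table = tabulate(schedules, "brand")
for answer, count in sorted(brand_table.items()):
    print(f"{answer}: {count}")
```

Because every schedule carries the same questions, the same function can be reused for any column of the table.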


Suitability of schedule method 


The schedule method is mostly applied in the following situations:
• When the field of investigation is wide and dispersed.
• When the researcher requires quick results at lower cost.
• When the respondents are well trained and educated.


Following is the sequence in which a schedule must be organised:


• Selection of respondents: Usually a sampling method is used for the
selection of respondents. The sample should be representative of the
population under study and should contain all the relevant information
about the respondents.


• Selection and training of field workers: Since the field workers
interview the respondents and collect the required data, they should
be selected carefully and given proper training.


• Conducting interviews: For a successful interview and correct result,
the following points must be considered:
o Follow the correct approach: The field worker should approach the
respondents in a correct manner so that the respondents can clearly
understand the purpose of the interview.
o Generating accurate responses: To obtain proper and accurate
responses, the field worker should take care not to misunderstand the
respondents' perspective and context.


Testing the validity of the gathered data 


After the schedules have been filled in, the gathered data is subjected to
certain tests in order to check its correctness. For this, the researcher can
interview the respondents again and check for any variation.
If the variations are large, the gathered data is not accurate and the
schedule is either rejected or modified.
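

The re-interview check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the answers, the question labels (q1 to q4) and the 30 per cent cut-off are hypothetical, since the text does not say how large a variation must be before a schedule is rejected.

```python
def variation_rate(first, second):
    """Share of questions whose answer changed between two interviews."""
    questions = first.keys() & second.keys()  # compare only shared questions
    if not questions:
        raise ValueError("no common questions to compare")
    changed = sum(1 for q in questions if first[q] != second[q])
    return changed / len(questions)


# Hypothetical answers from the original interview and the re-interview.
original = {"q1": "yes", "q2": "weekly", "q3": "brand A", "q4": "no"}
repeat = {"q1": "yes", "q2": "monthly", "q3": "brand A", "q4": "no"}

THRESHOLD = 0.30  # assumed cut-off; the text does not specify one

rate = variation_rate(original, repeat)
verdict = "reject or modify" if rate > THRESHOLD else "accept"
print(f"variation = {rate:.0%}, schedule: {verdict}")
```

In practice the researcher would choose the cut-off to suit the study; the sketch only shows that the comparison itself is a simple count of changed answers.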


9.2.1 Schedules Vs. Questionnaires 


Questionnaires and schedules have several similarities. However, there are
prominent differences between the two:


• A questionnaire is mostly sent by the interviewer to the interviewee
by mail and is filled in by the interviewee, whereas a schedule is filled in
by the interviewer at the time of the interview.


• Data collection through questionnaires is cheaper than through
schedules, as money is spent only on preparing the questionnaires and
mailing them. In the schedule method, extra money is spent on appointing
interviewers and training them.


• In the case of a questionnaire, the response rate is generally low because
most people do not respond to the questions. On the other hand, the response
rate is high in the case of schedules, since the interviewer fills them in at
the time of the interview.


• In the case of a questionnaire, the identity of the respondent may not be
known, whereas in the case of schedules the identity of the interviewee or
respondent is known.


• The questionnaire method is time consuming, as the respondent may
not return the questionnaire in time. There is no such problem in the
schedule method, as the interviewer fills in the schedules at the time of
the interview.


• A questionnaire does not allow personal contact with the respondent,
whereas a schedule establishes direct contact between the interviewer and
the respondent.


• The questionnaire method is useful only if the respondent is literate,
while in the case of a schedule it is not necessary for the interviewee to be
literate.


• The risk of incomplete and incorrect information is greater with a
questionnaire, while the information collected through schedules is more
complete and accurate.


9.3 QUESTIONNAIRE: MEANING, TYPES AND
FORMAT OF A GOOD QUESTIONNAIRE


The questionnaire form is an important and commonly used method of data 
collection. It is used mostly in case of large-scale enquiries. The categories of 
end users who use this technique include individuals, research workers, private 
and public organisations and governments. A questionnaire is a document 
that contains a set of questions printed or typed in a proper sequence. The 
questionnaire is sent to each individual who is supposed to answer it. This 
technique of collecting information through questionnaires is extensively 
used nowadays. The following are the advantages of a questionnaire: 


• It is cost-effective.
• As the respondents are allowed to answer the questions according to
their own views and understanding, this technique of data collection
is unbiased.
• All the respondents of the questionnaire are given enough time to
answer the questions.
• In this technique, a large sample can be used to make the
results more reliable.


In addition to the advantages mentioned above, questionnaires also
have certain disadvantages, which are as follows:


• This technique carries the possibility of non-response: the
respondents may not provide answers to all the questions asked.
• This technique can be used only if the respondents are skilled and
educated.
• It is not possible to determine whether or not a particular candidate is
appropriate to give information about a particular subject.
• It is a time-consuming technique.


In a questionnaire, the use of standardized questions can help collect
more reliable data. By using questionnaires, the systems analyst can
collect valuable information from the people in the organisation who may
be affected by the current and proposed system.


The various tasks performed during the questionnaire method are as
follows:


• Acquiring information before conducting an interview.
• Gaining information in order to verify facts found in the interview.
• Acquiring information on questions such as:
o How do users feel about the current system?
o Is there any problem that remains unsolved?
o What do people expect from a new or modified system?


Questionnaires should be used in the following situations:


• If the people to be questioned belong to different departments or
branches of the same organisation.
• If the project involves a large number of people and you want to know
what proportion of a given group approves or disapproves of a particular
feature of the proposed system.
• If you want to determine the overall opinion before the systems project
is given any consideration for implementation.


Questions included in a questionnaire can be either closed ended or
open ended.


Open-ended questions


Open-ended questions are questions which do not require specific responses.
Examples of this type of question are as follows:
• How will you evaluate the benefits of a newly installed system?
• How will you design the Management Information System?
• What is your opinion about the current income tax policy?


Closed-ended questions


Closed-ended questions are used when the systems analyst is able to list
all possible responses to the question effectively. All possible responses
to a closed question should be mutually exclusive. This type of question
is categorized into the following types:
• Fill in the blanks questions
• Dichotomous questions
• Ranking scale questions
• Multiple choice questions
• Rating scale questions


Fill in the blanks questions


These are questions which require specific responses that are analysed
statistically. The examples of this type of question are:
• What is your name?
• What is the name of your organisation?
• How many employees are there in the accounts department of your
organisation?
• How many automated systems are installed in your organisation?


Dichotomous questions


Dichotomous questions are questions which offer two answers, Yes or No.
The examples of this type of question are:
• Are you working with manual systems?
Yes or No
• If yes, do you need to switch over to the automated systems?
Yes or No
• If no, are you satisfied with the performance of manual systems?
Yes or No
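

The three example questions above form a short chain in which a follow-up is asked only after a particular answer. A minimal sketch of such branching is given below. It assumes, as one possible reading of the example, that each follow-up depends on the answer to the question immediately before it; the identifiers (manual, switch, satisfied) and the function name are hypothetical.

```python
# Each entry: question text, next question id after "Yes", next id after "No".
# None means that no further question is asked.
QUESTIONS = {
    "manual": ("Are you working with manual systems?", "switch", None),
    "switch": ("Do you need to switch over to the automated systems?",
               None, "satisfied"),
    "satisfied": ("Are you satisfied with the performance of manual systems?",
                  None, None),
}


def questions_asked(answers, start="manual"):
    """Return the ids of the questions a respondent is actually asked.

    answers maps a question id to True (Yes) or False (No).
    """
    asked = []
    current = start
    while current is not None:
        asked.append(current)
        _text, if_yes, if_no = QUESTIONS[current]
        current = if_yes if answers[current] else if_no
    return asked


print(questions_asked({"manual": True, "switch": False, "satisfied": True}))
print(questions_asked({"manual": False}))
```

Keeping the routing in a table rather than in nested conditions makes it easy to check that every Yes or No answer leads either to another question or to the end of the sequence.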


Ranking scale questions 


Ranking scale questions ask the respondent to arrange a list of items in the
order of their importance or preference. Consider the following question:


Please arrange the following in alphabetical order:
• London
• America
• India
• Italy


Multiple choice questions 


These types of questions allow you to select an option from a list of options.
The examples of this type of question are:
• What is the number of automated systems used in your organisation?
o 0-9
o 10-19
o 20-29
o More than 29
• What is the type of organisation you are working with?
o Bank
o Manufacturing Company
o Computer/IT Sector
o Other
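

The options of the first example are mutually exclusive and together cover every possible count, which is what a closed-ended question requires. The sketch below makes this concrete by mapping any count to exactly one option. The function name count_band is hypothetical; the option labels are those of the example above.

```python
def count_band(n):
    """Map a count of automated systems to exactly one answer option."""
    if n < 0:
        raise ValueError("count cannot be negative")
    if n <= 9:
        return "0-9"
    if n <= 19:
        return "10-19"
    if n <= 29:
        return "20-29"
    return "More than 29"


# Boundary values fall into one band only, so the options do not overlap.
for n in (0, 9, 10, 29, 30):
    print(n, "->", count_band(n))
```

Had the options been written as 0-10, 10-20 and so on, a respondent with exactly 10 systems could tick two boxes, and the responses would no longer be mutually exclusive.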


Rating scale questions 


In this type of question, a user is required to rate the options according to
his opinion. An example of this type of question is as follows:
How skilled are you in your work? (Helps in rating your skills)


                                               Once  Twice  Many times  Never
• Number of times you got promotion.             1     2        3         4
• Number of times you received appreciation.     1     2        3         4
• Number of times you are criticized for work.   1     2        3         4


Designing a questionnaire 


A questionnaire is a data collection technique in which written questions
are presented that are to be answered by people in written form. The points
that should be kept in mind while designing a questionnaire are as follows:


• The goal of the questionnaire must be specified, since it determines
whom you will survey and what you will ask them.


• The questions in the questionnaire must not be confusing or unfamiliar.
They should be short, simple and easy to understand so that the
questionnaire is easy to complete.


• The questions in the questionnaire must be properly stated. They should
not include private questions regarding salary, age, etc.


• Questions placed out of order or out of context should be avoided.
There should be specific questions, which can be followed by general
easy-to-answer questions.


• The period or time within which the questionnaire is to be
completed must be stated.


• The performance of the questionnaire should be determined by
pretesting it.


• The questionnaire should be finally reviewed and edited to ensure that
it is ready for administration.


• The type of questionnaire should be properly defined.


Reliable and valid questionnaires are designed using the scaling
construction technique. According to this technique, the researcher must
focus on the question content, question wording and question format. Please
refer to appendices too.


A Specimen questionnaire 


This hypothetical study is adapted from a study developed by Deepak
Mehendru* in India. Assume that this study involves 200 professors in
colleges in India who are asked about their interest in buying automobiles.
The basic objective of this survey is to determine certain marketing trends
among this population of professors regarding their automobile buying
patterns, based upon the following factors:

• The profile of the decision-maker who finally decides to buy a particular
type of car.

• People around the decision-maker who influence the decision-making
process.

• The factors affecting the selection of a particular dealer of cars.

• People in the family who make or affect decisions regarding the
maximum budget that can be allocated for purchasing a car.

• The effect of various options available in the car.

• The image and reliability of the company that makes these cars.

• The effect of heavy promotion on television about the utility of the car
on the decision-maker.

(For the sake of simplicity, it is assumed that the professors have only one
car in the family.)


The questionnaire 


1. General
   .......... Over 6…
   Yearly income
   .......... Less than ₹30,000
   .......... ₹30,000-₹39,999
   .......... ₹40,000-₹49,999
   .......... ₹50,000 and more
2. What type of car do you own now?
   .......... American
   .......... Japanese
   .......... European
3. What size of car do you own?
   .......... Luxury
   .......... Mid-size
   .......... Compact
4. Did you buy this car new or used?
   .......... New .......... Used
5. If you bought a used car, did you buy it from a dealer or a private
   party? .......... Dealer .......... Private party
6. If you bought a new car, how long have you owned this car?
   .......... Number of years
7. If you bought a used car, how old is this car now?
   .......... Number of years
8. Price paid for the car .......... New .......... Used


9. Who influenced your decision to purchase the above brand of car?
   Indicate if more than one.
   .......... Yourself          .......... Your wife
   .......... Your children     .......... Your friend
   .......... Your neighbour    .......... Your colleague
   .......... Others
10. Indicate as to who decided about the budget allocation for the car?
   .......... Yourself
   .......... Your spouse
   .......... Family decision
11. If you bought your car from a dealer, then who influenced your decision
   regarding the selection of a particular dealer?
   .......... Yourself
   .......... Your friend
   .......... Your colleague
   .......... Family decision
12. How did you come to know about this dealer?
   .......... TV commercial
   .......... Newspapers
   .......... Personal references
   .......... Others
13. Rank the following factors that affected the final decision at the time
   of purchasing the car (a rank of 1 measures the most important factor,
   a rank of 2 measures the second most important factor, and so on).
   .......... Very inconvenient without the car
   .......... Money was available
   .......... Reputation of car manufacturer
   .......... Discounts offered
   .......... Interest rate on financing
   .......... Guarantees and warranties offered
   .......... Others
14. Did you make an extensive survey regarding price comparisons after
   you decided to buy the particular car? .......... Yes .......... No
15. If you bought a used car, how did you learn about it?
   .......... Newspapers .......... Friend .......... Others
16. In order of preference, what were the major reasons for buying a used
   car?
   .......... Unavailability of adequate funds
   .......... Cheaper insurance
   .......... Lack of parking garage
   .......... Condition of the car
   .......... Others
17. Which of the following media do you think is most effective in creating
   an impact on the potential customer relative to a particular brand of
   car?
   .......... TV                .......... Newspapers
   .......... Magazines         .......... Favourable news reports
   .......... Word of mouth     .......... Others

The responses to such questions would form the basis of analysis in
order to achieve the set marketing objectives.


Questionnaire designing is an important part of research methodology;
thus, we have dealt with this topic in great detail in the appendix given at
the end of the book.


Characteristics of a good questionnaire 


• The questionnaire is a very important document because it is the first
interface between the respondent and the researcher. Thus, the appearance
of the instrument is very important. The first consideration is the quality
of the paper on which the questionnaire is printed. If the questionnaire
is printed on poor-quality paper or looks tattered and unprofessional,
the respondents do not value the study and thus are not very sincere or
careful in responding.

If there are too many questions, instead of just stapling the papers
together, it would be a good idea to put them together as a booklet.
Booklets are easier for the investigator to handle and for the subject
to answer. Secondly, one can use a double-page format for the questions;
the appearance, then, is more sombre and professional. The format, spacing
and positioning of the questions can have a significant effect on the
results, especially in the case of self-administered questionnaires.

The font style and spacing used in the entire document should be
uniform. One must ensure that every question and its response options
are printed on the same page. In fact, as far as possible, the response
categories should be in the same row as the question. This saves space
and, at the same time, is more respondent friendly.

If the questionnaire is long, or the researcher is economizing, one must
not crowd questions together with no line spacing to make the
questionnaire seem shorter. Such a format could result in errors while
recording, as the person could fill in the answer in the wrong row.
Secondly, if there are open-ended questions as well, the responses would
be less revealing and shorter. The respondent might feel that this is
going to be a really long and complex administration and may actually lose
interest. Thus, although it is advisable to have short instruments that
are not too taxing, if there is a research need for which the questions
cannot be shortened, one must not clutter the appearance of the measuring
instrument (questionnaire).
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response, sometimes it can be used to distinguish between the groups 
or for branching questions. Also, surveys for different groups could 
be on different coloured paper. This would be helpful when grouping 
the responses from different segments. For example, if Delhi is being 
studied as five zones, then the questionnaire used in each zone could 
be printed on a differently coloured paper. 




• As we saw in the last section, the questionnaire is segregated into
different sections to address the various information needs. It is useful
if the researcher divides the data needed into separate sections such as
Sections A, B, C and so on.

• The questions in each part should then be numbered, especially when
one is using branching questions. The other advantage of numbering
the questions is that, once the survey has been conducted, coding and
entering the data obtained becomes much easier. Precoded questionnaires
are easier to administer and record.


• In case there is any response instruction for an individual question,
it must accompany the question. In case it is a schedule and there
are instructions for asking the question as well as instructions for
responding, the response instruction should be placed very close to the
question. However, instructions about how to record the answer and
any probing question that needs to be asked should be placed after the
question. To distinguish the instructions from questions, one should
use a different font style. For example:

Overall, how satisfied (are/were) you with your [Domino’s] experience?
Would you say you are (READ LIST)?

Very satisfied ........................................ 5
Satisfied ............................................. 4
Neither satisfied nor dissatisfied .................... 3
Dissatisfied .......................................... 2
Or, very dissatisfied ................................. 1

IN CASE OF 2 or 1
(PROBE) What was the reason(s) for your experience? Kindly explain.


Check Your Progress

1. When is a questionnaire form used?
2. What are open-ended questions?
3. What is an observation schedule?
4. What method is used for the selection of respondents?
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QUESTIONS 


1. The questionnaire form is an important and commonly used method 
of data collection. It is used mostly in case of large-scale enquiries. 


2. Open-ended questions are questions which do not require specific
responses.

3. An observation schedule is the schedule under which the observer
observes all the activities and records all the responses of the
respondents under some predefined conditions.

4. Usually, the sampling method is used for the selection of respondents.


9.5 SUMMARY 


• The questionnaire form is an important and commonly used method
of data collection. It is used mostly in case of large-scale enquiries.

• The questionnaire is sent to each individual who is supposed to answer
it. This technique of collecting information through questionnaires is
extensively used nowadays.

• In a questionnaire, the use of standardized questions can help collect
more reliable data. By using questionnaires, the systems analyst
can collect valuable information from the people in the organisation
who may be affected by the current and proposed system.

• Closed-ended questions are used when the systems analyst is able to
effectively list all possible responses to the question.

• Ranking scale questions allow the researcher to arrange the list of items
in the order of their importance and preference.

• A questionnaire is a data collection technique in which written
questions are presented that are to be answered by the respondents in
written form.

• In case the questionnaire is long, or the researcher is economizing, one
must not crowd questions together with no line spacing to make the
questionnaire seem shorter.

• A schedule is a questionnaire containing a set of questions that are
required to be answered to collect the data about a particular item.

• There are five types of schedules, which are as follows:
  1. Observation schedule
  2. Rating schedule
  3. Document schedule
  4. Institution survey schedule
  5. Interview schedule

• The information or the questions included in the schedule should be
accurate and should enable the respondent to better understand the
context in which the questions are asked.

• After the respondents fill in the schedule, the gathered data is subjected
to certain tests in order to find out its correctness.

• Questionnaires and schedules have several similarities. However, there
are prominent differences that distinguish the two.

• In case of a questionnaire, the response rate is generally low because
most people do not respond to the questions. On the other hand, the
response rate is high in the case of schedules since the interviewer fills
them in at the time of the interview.

• The risk of incomplete and incorrect information is greater in a
questionnaire, while in schedules, the information collected is complete
and more accurate.


KEY WORDS 


Schedule: It is a questionnaire containing a set of questions that are 
required to be answered to collect the data about a particular item. 


Rating Schedule: It is the schedule used to measure and rate the 
thoughts, preferences, self-consciousness, perceptions and other similar 
characteristics of the respondents. 


Questionnaire: It is a set of printed or written questions with a choice 
of answers, devised for the purposes of a survey or statistical study. 


Institution Survey Schedule: It is the schedule used for studying 
different problems of institutions. 


SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short Answer Questions 


1. List the tasks performed during the questionnaire method. 


2. What are dichotomous questions? 


3. What are the objectives of a schedule? 


4. Differentiate between questionnaires and schedules.

Long Answer Questions


1. What is a questionnaire? Discuss its advantages and disadvantages. 
2. Discuss how to design questionnaires. 


3. Describe the characteristics of schedules. What are the various types 
of schedules? 


4. Discuss the merits and limitations of the scheduling method. 
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10.1 Objectives 
10.2 Scaling Techniques: Meaning, Importance, and Classification 
10.2.1 Types of Measurement Scales: Nominal, Ordinal, Interval and Ratio 
10.3 Methods of Construction of Questionnaires or Schedules 
10.4 Pre-Testing of Data Collection Tools 
10.5 Validity and Reliability Methods 
10.6 Answers to Check Your Progress Questions 
10.7 Summary 
10.8 Key Words 
10.9 Self Assessment Questions and Exercises 
10.10 Further Readings 


10.0 INTRODUCTION 


In the previous units, you have been introduced to concepts related to the
collection of data. You have learnt that sources are primary and secondary
in nature. You have also learnt about the questionnaire and scheduling
methods of collecting data. In this unit, we will discuss scaling techniques
of collecting data. Scaling is basically the process of generating the
continuum, a continuous sequence of values, upon which the measured objects
are placed. The unit will also discuss the pre-testing of data collection
tools as well as the concepts of reliability and validity.


10.1 OBJECTIVES 


After going through this unit, you will be able to:
• Describe the various techniques of scaling
• Discuss pre-testing of data collection tools
• Examine the concepts of validity and reliability


10.2 SCALING TECHNIQUES: MEANING, 
IMPORTANCE, AND CLASSIFICATION 


The scaling techniques used in research can also be classified into comparative
and non-comparative scales (Figure 10.1).

[Fig. 10.1 Types of Scaling. Comparative scales: paired comparison.
Non-comparative scales: graphic rating scale (continuous rating scale) and
itemized rating scale.]


Comparative scales 


In comparative scales it is assumed that respondents make use of a standard
frame of reference before answering the question. For example, a question
like ‘How do you rate Barista in comparison to Cafe Coffee Day on quality of
beverages?’ is a comparative rating scale question. It involves the direct
comparison of stimulus objects. Similarly, respondents may be asked whether
they prefer Chinese or Indian food. Consider the following set of questions
generally used to compare various attributes of Domino’s Pizza and Pizza Hut.


• Please rate Domino’s in comparison to Pizza Hut on the basis of
your satisfaction level on an 11-point scale, based on the following
parameters: (1 = Extremely poor, 6 = Average, 11 = Extremely good).
Circle your response:

a. Variety of menu options                         1  2  3  4  5  6  7  8  9  10  11
b. Value for money                                 1  2  3  4  5  6  7  8  9  10  11
c. Speed of service (delivery time)                1  2  3  4  5  6  7  8  9  10  11
d. Promotional offers                              1  2  3  4  5  6  7  8  9  10  11
e. Food quality                                    1  2  3  4  5  6  7  8  9  10  11
f. Brand name                                      1  2  3  4  5  6  7  8  9  10  11
g. Quality of service                              1  2  3  4  5  6  7  8  9  10  11
h. … takeaway location
i. Friendliness of the salesperson on the phone    1  2  3  4  5  6  7  8  9  10  11
j. Quality of packaging                            1  2  3  4  5  6  7  8  9  10  11
k. Adaptation of Indian taste                      1  2  3  4  5  6  7  8  9  10  11
l. Side orders/appetizers                          1  2  3  4  5  6  7  8  9  10  11


Comparative scale data is generally interpreted in relative terms. The
comparative scales include paired comparison, rank order, constant sum scale
and the Q-sort technique, to mention a few.

Each of the comparative rating scales is discussed in detail below:


Paired comparison scales: Here a respondent is presented with two objects
and is asked to select one according to whatever criterion he or she wants to
use. The resulting data from this scale is ordinal in nature. As an example,
suppose a parent wants to offer a child one of four items: chocolate,
burger, ice cream and pizza. The child is asked to choose one item from each
of the six possible pairs, i.e., chocolate or burger, chocolate or ice cream,
chocolate or pizza, burger or ice cream, burger or pizza and ice cream or pizza.
In general, if there are n items, the number of paired comparisons is
n(n − 1)/2. The paired comparison technique is useful when the number of items
is limited because it requires a direct comparison and overt choice. In case
the number of items to be compared is large (say 10), it would result in 45
paired comparisons, which would cause fatigue for the respondents.
Further, in reality a respondent does not choose between two items at
a time; there are multiple alternatives available to him or her.
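The pair count n(n − 1)/2 can be checked with a short sketch. The function name `paired_comparisons` is illustrative only and is not part of the text; the snack items are those of the example above.

```python
from itertools import combinations


def paired_comparisons(items):
    """Return every unordered pair of items, one pair per comparison."""
    return list(combinations(items, 2))


snacks = ["chocolate", "burger", "ice cream", "pizza"]
pairs = paired_comparisons(snacks)
print(len(pairs))                           # 4 * 3 / 2 = 6 pairs
print(len(paired_comparisons(range(10))))   # 10 * 9 / 2 = 45 pairs
```

Because the number of pairs grows roughly with the square of the number of items, the respondent burden rises quickly as items are added.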


There are many ways of analysing paired comparison data. The analysis can
yield an ordinal scale and also an interval scale measurement. This will be
shown with the help of an example. Let us assume that there are five brands
(A, B, C, D and E) and that two brands at a time are presented to the
respondent, who is asked to choose one of them. As there are five brands,
this results in 10 paired comparisons. Suppose this is administered to a
sample of 250 respondents with the results as presented in Table 10.1.
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The above table may be interpreted by assuming that the cell entry in 
the matrix represents the proportion of respondents who believe that ‘the 
column brand is preferred over the row brand’. For example: 


In brand A versus brand B comparison it can be said that 60 per cent 
of the respondents prefer brand B to brand A. Similarly, 30 per cent of the 
respondents prefer brand C to brand A and so on. 


To develop the ordinal scale from the given paired comparison data
in the above table, we can convert the entries in the table to 0-1 scores.
These show whether or not the column brand dominates the row brand. If the
proportion is greater than 0.5 in the above table, a score of ‘1’ is
assigned to that cell, which means that the column brand is preferred
over the row brand. Whenever the proportion is less than 0.5 in the above
table, a score of ‘0’ is assigned to that cell, which means that the column
brand does not dominate the row brand. The results are in Table 10.2.

To get the ordinal relationship among the brands, we total the columns.
Here the ordinal scale of brands is D > B > A > C > E. This means brand D
is the most preferred brand, followed by B, A, C and E.
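The conversion to 0-1 scores and the column totals can be sketched as follows. The proportions in `P` are hypothetical values chosen for illustration, not the entries of Table 10.1; only the 0.60 (B over A) and 0.30 (C over A) entries are taken from the text. The function names are likewise assumptions.

```python
BRANDS = ["A", "B", "C", "D", "E"]

# Hypothetical proportions: P[row][col] is the share of respondents who
# prefer the column brand over the row brand. The diagonal is set to 0.5.
P = [
    [0.50, 0.60, 0.30, 0.65, 0.20],  # row A
    [0.40, 0.50, 0.35, 0.55, 0.25],  # row B
    [0.70, 0.65, 0.50, 0.75, 0.40],  # row C
    [0.35, 0.45, 0.25, 0.50, 0.15],  # row D
    [0.80, 0.75, 0.60, 0.85, 0.50],  # row E
]


def dominance_matrix(proportions):
    """Score 1 where the column brand is preferred by more than half, else 0."""
    return [[1 if value > 0.5 else 0 for value in row] for row in proportions]


def column_totals(matrix):
    """Number of brands that each column brand dominates."""
    return [sum(column) for column in zip(*matrix)]


totals = column_totals(dominance_matrix(P))
score = dict(zip(BRANDS, totals))
order = sorted(BRANDS, key=score.get, reverse=True)
print(" > ".join(order))
```

With these illustrative proportions the column totals reproduce the ordering stated in the text; different data would of course give a different ordering.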


Table 10.2 Conversion of Paired Comparison Data into 0 to 1 Form 


In order to obtain the interval scale data from the paired comparison
data as presented above, the entries in the table can be analysed by using a
technique called Thurstone’s law of comparative judgement, which converts
the ordinal judgements into the interval data. Here the proportions are
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a can be computed. Z-value has symmetric distribution with a mean of ‘0’ and 
variance of ‘1’. If the proportion is less than 0.5, the corresponding Z-value
has a negative sign and for the proportion that is greater than 0.5, the Z-score
takes a positive value. The Z-scores for the paired comparison data are given
in Table 10.3.


Table 10.3 Z-scores for Paired Comparison Data 


Interval scale value with change of origin: 0.696, 0.536, 0.502, 0.381


The entries in Table 10.3 show the distance between two brands.
Assuming that the scores can be added, the total distance is computed for
each column. The average distance is computed by dividing the total score
by the number of brands. This way one obtains the position of each brand.
Now the absolute value of the largest negative average among all the
columns is added to each average so that, by a change of origin, interval
scale values can be obtained. This is shown in the last row and the values
are of interval scale, indicating the difference between brands. Brand D is
the most preferred brand and E is the least preferred brand and the distance
between the two is 0.696. The distance between brand C and E equals 0.381.
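The steps just described (proportions to Z-scores, column averages, change of origin) can be sketched as below. This is a minimal illustration under stated assumptions: the proportions in `P` are hypothetical, not those of Table 10.1, so the resulting values will not match Table 10.3; the diagonal is assigned a Z-score of 0; and the function name `interval_scale` is illustrative.

```python
from statistics import NormalDist

BRANDS = ["A", "B", "C", "D", "E"]

# Hypothetical proportions: P[row][col] is the share of respondents who
# prefer the column brand over the row brand.
P = [
    [0.50, 0.60, 0.30, 0.65, 0.20],
    [0.40, 0.50, 0.35, 0.55, 0.25],
    [0.70, 0.65, 0.50, 0.75, 0.40],
    [0.35, 0.45, 0.25, 0.50, 0.15],
    [0.80, 0.75, 0.60, 0.85, 0.50],
]


def interval_scale(proportions):
    """Convert paired-comparison proportions into interval scale values."""
    normal = NormalDist()  # standard normal: mean 0, variance 1
    n = len(proportions)
    # Z-score for each cell; a brand compared with itself gets 0.
    z = [
        [0.0 if r == c else normal.inv_cdf(proportions[r][c]) for c in range(n)]
        for r in range(n)
    ]
    # Average distance per column brand: column total / number of brands.
    averages = [sum(z[r][c] for r in range(n)) / n for c in range(n)]
    # Change of origin: shift so that the lowest brand sits at zero.
    shift = -min(averages)
    return [average + shift for average in averages]


values = dict(zip(BRANDS, interval_scale(P)))
for brand in sorted(values, key=values.get, reverse=True):
    print(brand, round(values[brand], 3))
```

After the shift, the least preferred brand has the value 0 and every other value can be read as its distance from that brand.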


Rank order scaling: In rank order scaling, respondents are presented
with several objects simultaneously and asked to order or rank them according
to some criterion. Consider, for example, the following question:

• Rank the following soft drinks in order of your preference. The most
preferred soft drink should be ranked one, the second most preferred
should be ranked two and so on.

Soft Drinks       Rank
Coke              ____
Pepsi             ____
Limca             ____
Sprite            ____
Mirinda           ____
Seven Up          ____
Fanta             ____

Like paired comparison, this approach is also comparative in nature.
The problem with this scale is that a respondent who does not like any of
the above-mentioned soft drinks is still forced to rank them. In that case,
the soft drink ranked one should be treated as the least disliked soft
drink, and the other rankings should be interpreted similarly. This
scale is very commonly used to measure preferences for brands as well as
attributes. Rank order scaling results in ordinal data.
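A completed rank order question can be checked and summarised with a small sketch. The three respondents and their ranks are hypothetical, and summing ranks is only one simple descriptive summary (an assumption of this sketch, not a method prescribed by the text); the data remain ordinal.

```python
def is_valid_ranking(ranks):
    """True if the ranks are exactly 1..n, with no ties and no gaps."""
    return sorted(ranks.values()) == list(range(1, len(ranks) + 1))


def rank_sums(responses):
    """Total rank per item across respondents (lower = more preferred)."""
    items = responses[0].keys()
    return {item: sum(response[item] for response in responses) for item in items}


# Hypothetical answers from three respondents for three of the soft drinks.
responses = [
    {"Coke": 1, "Pepsi": 2, "Limca": 3},
    {"Coke": 2, "Pepsi": 1, "Limca": 3},
    {"Coke": 1, "Pepsi": 3, "Limca": 2},
]

assert all(is_valid_ranking(response) for response in responses)
sums = rank_sums(responses)
print(sorted(sums, key=sums.get))  # items from most to least preferred overall
```

A response such as `{"Coke": 1, "Pepsi": 1, "Limca": 3}` would fail the validity check because two items share the same rank.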


Constant sum rating scaling: In the constant sum rating scale, the respondents
are asked to allocate a total of 100 points among various objects or brands.
The respondent distributes the points to the various objects in the order of
his or her preference. Consider the following example:

• Allocate a total of 100 points among the various schools into which
you would like to admit your child. The more points you allocate
to a school, the more preferred it is considered to be. The points should
be allocated in such a way that the points allocated to the various
schools add up to 100.

Schools                     Points
DPS                         ______
Modern School               ______
Mother’s International      ______
APEEJAY                     ______
DAV Public School           ______
Laxman Public School        ______
Tagore International        ______
TOTAL POINTS                100
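The allocation rule in the question above can be checked mechanically. In this sketch the point values are hypothetical (they are chosen to sum to 100 and to include a 30 versus 15 allocation), and the function names are illustrative.

```python
def is_valid_allocation(points, total=100):
    """True if no allocation is negative and the points add up to the total."""
    return all(p >= 0 for p in points.values()) and sum(points.values()) == total


def preference_ratio(points, first, second):
    """How many times as many points 'first' received compared with 'second'."""
    return points[first] / points[second]


# Hypothetical allocation by one respondent.
allocation = {
    "DPS": 20,
    "Modern School": 10,
    "Mother's International": 30,
    "APEEJAY": 10,
    "DAV Public School": 10,
    "Laxman Public School": 15,
    "Tagore International": 5,
}

print(is_valid_allocation(allocation))
print(preference_ratio(allocation, "Mother's International", "Laxman Public School"))
```

Because the points carry magnitudes as well as order, ratios between allocations can be compared, which is not possible with simple ranks.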


Suppose Mother’s International is awarded 30 points, whereas Laxman 
Public School is awarded 15 points, one can make a statement that the 
respondent rates Mother’s International twice as high as Laxman Public 
School. This type of data is not only comparative in nature but could also result 
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weights which the consumer may assign to the various attributes of a product. 


Q-sort technique: The Q-sort technique was developed to discriminate
among a large number of objects quickly. This technique makes use of the rank
order procedure, in which objects are sorted into different piles based on
their similarity with respect to a certain criterion. Suppose there are 100
statements and an individual is asked to sort them into five piles in such a
way that the strongly agreed statements are put in one pile, agreed
statements in another pile, neutral statements in the third pile, disagreed
statements in the fourth pile and strongly disagreed statements in
the fifth pile. The data generated in this way would be ordinal in
nature. The distribution of the number of statements in each pile should be
such that the resulting data approximately follow a normal distribution. The
number of piles need not be restricted to five. It could be as large as 10
or more, as a larger number of piles increases the reliability or precision
of the results.
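One way to obtain pile sizes that roughly follow a normal distribution is sketched below. This is an illustration under assumptions that are not in the text: the piles are taken as equal-width intervals spanning ±2.5 standard deviations, the two end piles absorb the tails, and leftover statements are assigned to the piles with the largest fractional remainders.

```python
from statistics import NormalDist


def qsort_pile_sizes(n_statements, n_piles, spread=2.5):
    """Pile sizes that approximate a normal curve and sum to n_statements."""
    normal = NormalDist()
    width = 2 * spread / n_piles
    # Interior cut points between piles, measured in standard deviations.
    cuts = [-spread + width * i for i in range(1, n_piles)]
    cdf = [0.0] + [normal.cdf(c) for c in cuts] + [1.0]
    shares = [(hi - lo) * n_statements for lo, hi in zip(cdf, cdf[1:])]
    sizes = [int(share) for share in shares]
    # Hand out the statements lost to rounding, largest remainder first.
    leftover = n_statements - sum(sizes)
    by_remainder = sorted(
        range(n_piles), key=lambda i: shares[i] - sizes[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes


sizes = qsort_pile_sizes(100, 5)
print(sizes)  # few statements in the extreme piles, most in the middle pile
```

The `spread` parameter is a design choice of this sketch; a researcher may equally fix the pile sizes by hand as long as the overall shape is roughly bell-shaped.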




Non-comparative scales 


In the non-comparative scales, the respondents do not make use of any frame
of reference before answering the questions. The resulting data is generally
assumed to be interval or ratio scale. For example, the respondent may be
asked to evaluate the quality of food in a restaurant on a five-point scale
(1 = very poor, 2 = poor, and so on up to 5 = very good).
The non-comparative scales are divided into two categories, namely, the
graphic rating scales and the itemized rating scales. The itemized rating
scales are further divided into the Likert scale, the semantic differential
scale and the Stapel scale. All these come under the category of
multiple-item scales.


Graphic rating scale 


This is a continuous scale, also called the continuous rating scale. In the
graphic rating scale the respondent is asked to tick his or her preference
on a graph. Consider, for example, the following question:

• Please put a tick mark (✓) on the following line to indicate your
preference for fast food.

    1                                          7
  Least     |__________________________|     Most
Preferred                                  Preferred

To measure the preference of an individual towards fast food, one
has to measure the distance from the extreme left to the position where the
tick mark has been put. The greater the distance, the higher the
individual's preference for fast food. This scale suffers from two
limitations. First, if a respondent has put a tick mark at a particular
position and after ten minutes he or she is given another form to put a tick
mark, it will virtually be impossible
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The basic assumption in this scale is that the respondents can distinguish the 

fine shades of difference between preferences or attitudes, which need not be
the case. Further, the coding, editing and tabulation of data generated through
such a procedure is a very tedious task and researchers would try to avoid
using it. Another version of the graphic scale could be the following:

• Please put a tick mark (✓) on the following line to indicate your
preference for fast food.

            1     2     3     4     5     6     7
  Least     |_____|_____|_____|_____|_____|_____|     Most
Preferred                                           Preferred

This is a slightly better version than the one discussed earlier. It 
will overcome the limitation of the scale to some extent. For example, if 
a respondent had earlier ticked between 5 and 6, it is likely that he would 
remember the same and the second time, he would tick very close to where 
he did earlier. This means that the difference in the two responses could be 
negligible. 
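The distance-based scoring described for the graphic rating scale can be sketched as a simple linear mapping. The function name, the millimetre units and the 120 mm line length are assumptions for illustration; the 1 to 7 end points are those shown on the scale.

```python
def graphic_scale_score(tick_mm, line_mm, low=1.0, high=7.0):
    """Convert the distance of a tick mark from the left end into a score.

    tick_mm: measured distance of the tick from the extreme left of the line
    line_mm: total length of the printed line
    """
    if line_mm <= 0:
        raise ValueError("line length must be positive")
    if not 0 <= tick_mm <= line_mm:
        raise ValueError("tick mark must lie on the line")
    return low + (high - low) * tick_mm / line_mm


print(graphic_scale_score(60, 120))   # a tick at the midpoint of a 120 mm line
print(graphic_scale_score(120, 120))  # a tick at the extreme right
```

Every questionnaire has to be measured by hand in this way, which illustrates why the coding of graphic scales is considered tedious.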

Another way of presenting the graphic rating scale is through the smiling
face scale. The following example illustrates this.

• Please indicate how much you like fast food by pointing to the face
that best shows your attitude and taste. If you do not prefer it at all,
you would point to face one. In case you prefer it the most, you would
point to face seven.

[Smiling face scale: faces numbered one (do not prefer at all) to seven
(prefer the most)]


Itemized rating scale 


In the itemized rating scale, the respondents are provided with a scale that has
a number or a brief description associated with each of the response
categories. The response categories are ordered in terms of scale position
and the respondents are supposed to select the category that best describes
the object being rated. Itemized rating scales are widely
used in survey research. There are certain issues that should be kept in mind
while designing the itemized rating scale. These issues are:


Number of categories to be used: There is no hard and fast rule as to how 
many categories should be used in an itemized rating scale. However, it is a 
practice to use five or six categories. Some researchers are of the opinion that 


more than five categories should be used in situations where small changes in 
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would find it difficult to distinguish between more than five categories. It is, 
however, a fact that additional categories need not increase the precision
with which the attitude is measured. It is generally seen that researchers
use five-category scales and, in special cases, may increase or decrease the
number of categories.




Odd or even number of categories: It has been a matter of debate among 
the researchers as to whether odd or even number of categories are to be used 
in survey research. By using even number of categories the scale would not 
have a neutral category and the respondent will be forced to choose either 
the positive or the negative side of the attitude. If odd numbers of categories 
are used, the respondent has the freedom to be neutral if he wants to be so. 
The Likert scale (to be discussed later) is a balanced rating scale with an 
odd number of categories and a neutral point. It is generally seen that if a 
respondent is not aware of the subject matter being measured by the scale, he 
would prefer to be neutral. However, if we have selected our unit of analysis 
to be one who is knowledgeable about the study being conducted and if he 
prefers to be neutral, we should not debar him from this opportunity. 


Balanced versus unbalanced scales: A balanced scale is one which
has an equal number of favourable and unfavourable categories. Examples of
balanced and unbalanced scales are given below. The following is an example
of a balanced scale:

• How important is price to you in buying a new car?
    Very important
    Relatively important
    Neither important nor unimportant
    Relatively unimportant
    Very unimportant

In this question, there are five response categories, two of which
emphasize the importance of price and two others that do not show its
importance. The middle category is neutral.

The following is an example of an unbalanced scale:

• How important is price to you in buying a new car?
    More important than any other factor
    Extremely important
    Important
    Somewhat important
    Unimportant

In this question there are four response categories that are skewed
towards the importance given to the price, whereas one category is for the
unimportant side. Therefore, this question is an unbalanced question. In the
unbalanced scale, the numbers of favourable and unfavourable categories
are not the same. One could use an unbalanced scale depending upon the
nature of the attitude distribution to be measured. If the distribution is
dominantly favourable, an unbalanced scale with more favourable categories
than unfavourable categories would be appropriate. If an unbalanced scale is
used, the nature and degree of the unbalance in the scale should be taken
into account during the data analysis.
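The balance of a scale can be checked by counting favourable and unfavourable categories. In this sketch each category is tagged with a polarity (+1 favourable, 0 neutral, -1 unfavourable); this coding and the function name are assumptions made for illustration, applied to the two price questions above.

```python
def is_balanced(categories):
    """True if the scale has as many favourable as unfavourable categories."""
    favourable = sum(1 for _, polarity in categories if polarity > 0)
    unfavourable = sum(1 for _, polarity in categories if polarity < 0)
    return favourable == unfavourable


balanced_scale = [
    ("Very important", +1),
    ("Relatively important", +1),
    ("Neither important nor unimportant", 0),
    ("Relatively unimportant", -1),
    ("Very unimportant", -1),
]

unbalanced_scale = [
    ("More important than any other factor", +1),
    ("Extremely important", +1),
    ("Important", +1),
    ("Somewhat important", +1),
    ("Unimportant", -1),
]

print(is_balanced(balanced_scale), is_balanced(unbalanced_scale))
```

Recording the polarity of each category alongside its label also makes it easier to account for the degree of unbalance at the analysis stage.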




Nature and degree of verbal description: Many researchers believe that 
each category must have a verbal, numerical or pictorial description. Verbal 
description should be clearly and precisely worded so that the respondents 
are able to differentiate between them. Further, the researcher must decide 
whether to label every scale category, some scale categories, or only extreme 
scale categories. It is argued that a clearly defined response category increases 
the reliability of the measurement. 


Forced versus non-forced scales: An important issue concerning the 
construction of an itemized rating scale is the use of a forced scale versus 
non-forced scale. In the forced scale, the respondent is forced to take a stand, 
whereas in the non-forced scale, the respondent can be neutral if he/she so 
desires. The argument for a forced scale is that those who are reluctant to 
reveal their attitude are encouraged to do so with the forced scale. Paired 
comparison scale, rank order scale and constant sum rating scales are 
examples of forced scales. 


Physical form: There are many options that are available for the presentation 
of the scales. It could be presented vertically or horizontally. The categories 
could be expressed in boxes, discrete lines or as units on a continuum. They 
may or may not have numbers assigned to them. The numerical values, if 
used, may be positive, negative or both. 


Suppose we want to measure the perception about Jet Airways using 
a multi-item scale. One of the questions is about the behaviour of the crew 
members. Given below is a set of scale configurations that may be used to 
measure their behaviour. The following are some of the examples where 
various forms of presenting the scales are shown: 


The behaviour of the crew members of Jet Airways is: 


1. Very bad ______________________________ Very good

2. Very bad   1   2   3   4   5   Very good

3. [ ] Very bad
   [ ]
   [ ] Neither bad nor good
   [ ]
   [ ] Very good

4. Very bad   Bad   Neither bad nor good   Good   Very good

5.    -2        -1           0            1         2
   Very bad        Neither bad nor good        Very good

Below we will describe some of the itemized rating scales which are 
very commonly used in survey research. 


Likert scale: This is a multiple item agree-disagree five-point scale. The 
respondents are given a certain number of items (statements) on which they 
are asked to express their degree of agreement/disagreement. This is also 
called a summated scale because the scores on individual items can be added 
together to produce a total score for the respondent. An assumption of the 
Likert scale is that each of the items (statements) measures some aspect of a 
single common factor, otherwise the scores on the items cannot legitimately 
be summed up. In a typical research study, there are generally 25 to 30 items 
on a Likert scale. 


To construct a Likert scale to measure a particular construct, a large 
number of statements pertaining to the construct are listed. These statements 
could range from 80 to 120. The identification of the statements is done 
through exploratory research which is carried out by conducting a focus 
group, unstructured interviews with knowledgeable people, literature survey, 
analysis of case studies and so on. Suppose we want to assess the image of a 
company. As a first step, an exploratory research may be conducted by having 
an informal interview with the customers, and employees of the company. 
The general public may also be contacted. A survey of the literature on the 
subject may also give a set of information that could be useful for constructing 
the statements. Suppose the number of statements to measure the constructs 
is 100 in number. Now samples of representative respondents are asked to 
state their degree of agreement/disagreement on those statements. Table 10.4 
gives a few statements to assess the image of the company. 


It may be noted that only anchor labels and no numerical values are 
assigned to the response categories. Once the scale is administered, numerical 
values are assigned to the response categories. The scale contains statements, 
some of which are favourable to the construct we are trying to measure and 
some are unfavourable to it. 


For example, out of the ten statements given, statements numbering 1, 
2, 4, 6 and 9 in Table 10.4 are favourable statements, whereas the remaining 
are unfavourable statements. The reason for having a mixture of favourable 
and unfavourable statements in a Likert scale is that the responses by the 
respondent should not become monotonous while answering the questions. 


Generally, in a Likert scale, there is an approximately equal number of 
favourable and unfavourable statements. Once the scale is administered, 
numerical values are assigned to the responses. The rule is that a ‘strongly 
agree’ response for a favourable statement should get the same numerical 
value as the ‘strongly disagree’ response of the unfavourable statement. 
Suppose for a favourable statement the numbering is done as Strongly disagree 
= 1, Disagree = 2, Neither agree nor disagree = 3, Agree = 4 and Strongly 
agree = 5. Accordingly, an unfavourable statement would get the numerical 
values as Strongly disagree = 5, Disagree = 4, Neither agree nor disagree = 
3, Agree = 2 and Strongly agree = 1. In order to measure the image that the 
respondent has about the company, the scores are added. 
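
The scoring rule described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the function name likert_total and the raw responses are hypothetical, chosen so that the total agrees with the 42-out-of-50 worked example in the text, and statements 3, 5, 7, 8 and 10 are treated as the unfavourable ones.

```python
# Raw responses are coded Strongly disagree = 1 ... Strongly agree = 5.
# Hypothetical raw ticks for the ten statements, keyed by statement number.
raw_responses = {1: 3, 2: 5, 3: 2, 4: 4, 5: 1, 6: 4, 7: 2, 8: 1, 9: 4, 10: 2}

# Statements worded unfavourably (the rest are favourable), as in the text.
UNFAVOURABLE = {3, 5, 7, 8, 10}


def likert_total(responses, unfavourable, points=5):
    """Sum item scores after reversing the unfavourable statements."""
    total = 0
    for statement, raw in responses.items():
        if not 1 <= raw <= points:
            raise ValueError(f"response {raw} outside 1..{points}")
        # Reverse scoring maps 1->5, 2->4, ..., 5->1 on a five-point scale.
        total += (points + 1 - raw) if statement in unfavourable else raw
    return total


total_score = likert_total(raw_responses, UNFAVOURABLE)
print(total_score, "out of", 5 * len(raw_responses))
```

Reversing the unfavourable items before summing is what makes a high total mean a favourable image regardless of how each statement is worded.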


Table 10.4 Likert Scale Statements to Measure the Image of the Company 


Response categories for each statement: Strongly disagree | Disagree | Neither agree nor disagree | Agree | Strongly agree

 1. The company makes quality products.
 2. The company is a leader in technology.
 3. It doesn’t care about the general public.
 4. The company leads in R&D to improve products.
 5. The company is not a good paymaster.
 6. The products of the company go through stringent quality tests.
 7. The company has not done anything to curb pollution.
 8. It does not care about the community near its plant.
 9. The company’s stocks are good to buy or own.
10. The company does not have good labour relations.


For example, if a respondent has ticked (✓) statements numbering from 
one to ten as shown in Table 10.4, his total score would be 3 + 5 + 4 + 4 + 
5 + 4 + 4 + 5 + 4 + 4 = 42 out of 50. Now if there are 100 respondents and 
100 statements, the score on the image of the company can be worked out 
for each respondent by adding his/her scores on the 100 statements. 


The minimum score for each respondent will be 100, whereas the 
maximum score would be 500. 


As mentioned earlier, a typical Likert scale comprises about 25-30 
statements. In order to select 25 statements from the 100 statements, we need 
to discard some of them. The rule behind discarding the statements is that 
those items that are non-discriminating should be removed. The procedure 
for choosing the statements (say, 25 of them) is shown below. 


As mentioned earlier, the score for each of the respondents on each of 
the statements can be used to measure his/her total score about the image of 
the company. The data may look as given in Table 10.5. 


Table 10.5 shows that the total score for respondent no. 1 is 410, 
whereas for respondent no. 2 it is 209. This means that respondent no. 1 has 
a more favourable image of the company as compared to respondent no. 2. 
Now, in order to select 25 statements, let us consider statements numbering 
i and j. We note that statement no. j is more discriminating as compared 
to statement no. i. This is because the score on statement j is more highly 
correlated with the total score than the score on statement i. 
Therefore, if we have to choose between i and j, we will choose statement 
no. j. From this we can conclude that only those statements will be selected 
which have a very high correlation with the total score. Therefore, the 
100 correlations, one for each statement, are to be arranged in descending 
order of magnitude and only the top 25 statements, those having the highest 
correlation with the total score, need to be selected. 
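
The selection rule above (keep the statements whose scores correlate most strongly with the total score) can be sketched as follows. This is a hedged illustration, not the text's own procedure in code: the helper names pearson and select_statements and the small 5 x 4 data set are assumptions made for the example, and the total used here includes the item itself, as the text describes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant item cannot discriminate between respondents
    return cov / sqrt(var_x * var_y)


def select_statements(scores, keep):
    """Return 1-based statement numbers with the highest item-total correlation.

    scores[r][s] is the (already reverse-coded) score of respondent r on
    statement s.
    """
    totals = [sum(row) for row in scores]
    n_items = len(scores[0])
    correlations = [
        pearson([row[s] for row in scores], totals) for s in range(n_items)
    ]
    # Descending order: the most discriminating statements come first.
    ranked = sorted(range(n_items), key=lambda s: correlations[s], reverse=True)
    return [s + 1 for s in ranked[:keep]]


# Hypothetical data: 5 respondents x 4 statements.
scores = [
    [5, 5, 2, 3],
    [4, 5, 3, 3],
    [3, 3, 2, 4],
    [2, 1, 3, 2],
    [1, 1, 2, 3],
]
selected = select_statements(scores, keep=2)
print(selected)
```

With 100 statements and keep=25, the same function would return the 25 statements to retain.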


Table 10.5 Total Score and Individual Score of each Respondent on Various Statements 


Respondent no. | Scores of Statements (1, 2, ..., i, ..., j, ..., 100) | Total score
1              | ...                                                  | 410
2              | ...                                                  | 209
...            | ...                                                  | ...
100            | ...                                                  | ...


Another method of selecting the number of statements from a relatively 
large number of them is through the use of factor analysis. This aspect will 
be covered at the appropriate stage in the unit on factor analysis. 


Semantic differential scale: This scale is widely used to compare the 
images of competing brands, companies or services. Here the respondent is 
required to rate each attitude or object on a number of five- or seven-point 
rating scales. This scale is bounded at each end by bipolar adjectives or 
phrases. The difference between Likert and Semantic differential scale is 
that in Likert scale, a number of statements (items) are presented to the 
respondents to express their degree of agreement/disagreement. However, 
in the semantic differential scale, bipolar adjectives or phrases are used. As 
in the case of Likert scale, the information on the phrases and adjectives is 
obtained through exploratory research. At times there may be a favourable 
or unfavourable descriptor (adjectives) on the right-hand side and on certain 
occasions these may be presented on the left-hand side. This rotation 
becomes necessary to avoid the halo effect. This is because the location of 
previous judgments on the scale may influence the subsequent judgements 
because of the carelessness of the respondents. The mid point of a bipolar 
scale is a neutral point. In the Likert scale, ten statements were used where 
respondents were asked to express their degree of agreement/disagreement 
regarding the image of the company. Taking the same example further, the 
semantic differential scale corresponding to those ten statements in Likert 
scale is shown below where the bipolar adjectives/phrases are separated by 
seven points. These points can be numbered as 1, 2, 3, ..., 7 or +3, +2, +1, 
0, -1, ..., -3 for a favourable descriptor positioned on the left-hand side. 
For an unfavourable descriptor the numberings would be reversed. A typical 
semantic differential scale where bipolar adjectives/phrases are positioned 
at the two extreme ends is given in Table 10.6. 


Table 10.6 Select Bipolar Adjectives/Phrases of Semantic Differential Scale 


 1. Makes quality products                      | Does not make quality products
 2. Leader in technology                        | Backward in technology
 3. Does not care about general public          | Cares about general public
 4. Leads in R&D                                | Lagging behind in R&D
 5. Not a good paymaster                        | Good paymaster
 6. Products go through stringent quality test  | Products don’t go through quality test
 7. Does nothing to curb pollution              | Does a remarkable job in curbing pollution
 8. Does not care about community near plants   | Cares about community near plants
 9. Company stocks good to buy                  | Not advisable to invest in company stock
10. Does not have good labour relations         | Has good labour relations


Once the scale is constructed and administered to the representative 
respondents, the mean score for each of the descriptors is calculated. The scale 
is administered under the assumption that the numerical values assigned to 
the response categories are of interval scale in nature. This is generally the 
practice adopted by many researchers. However, if the response categories 
are treated as ordinal scale, instead of computing the arithmetic mean, median 
may be computed. In this example, we are treating the responses as the interval 
scale and hence the mean is computed. Once the mean for all the bipolar 
adjectives/phrases is computed we put the result in the form of a pictorial 
profile so as to make the comparison easy. At this time, all the favourable 
descriptors are kept on one side and all the unfavourable descriptors are 
positioned at the other. In our example, we have positioned all the favourable 
descriptors for the two companies whose image we want to compare on the 
left hand side. This is shown in Table 10.7. 


Table 10.7 Pictorial Profile based on Semantic Differential Ratings 


 1. Makes quality products                      | Does not make quality products
 2. Leader in technology                        | Backward in technology
 3. Cares about general public                  | Does not care about general public
 4. Leads in R&D                                | Lagging behind in R&D
 5. Good paymaster                              | Not a good paymaster
 6. Products go through stringent quality test  | Products do not go through quality test
 7. Done remarkable job in curbing pollution    | Done nothing to curb pollution
 8. Cares about community near plants           | Does not care about community near plants
 9. Company stocks good to buy                  | Not advisable to invest in company stock
10. Has good labour relations                   | Does not have good labour relations

[Profile lines showing the mean ratings of Company A and Company B on each item]


As per the results presented in the pictorial profile, Company A is better 
than Company B in the sense that it makes quality products, leads in R&D, 
its products go through stringent quality tests, its stocks are good to buy and 
it has good labour relations. Company B is ahead of Company A as it cares 
about the general public and is a good paymaster. Company A is better than 
Company B as it leads in technology, whereas Company B is better than 
Company A as it has done a remarkable job in curbing pollution. However, 
these differences are not statistically significant. 
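
The step of re-orienting the descriptors and averaging the ratings into a profile can be sketched as below. This is a minimal sketch under assumptions: the item labels, the three ratings per item and the helper oriented_score are hypothetical, and positions are counted 1 to 7 from left to right and mapped to the +3 to -3 numbering mentioned earlier.

```python
from statistics import mean

# Position ticked on the seven-point line, counted 1..7 from left to right.
# Hypothetical ratings from three respondents for one company.
ratings = {
    "quality products": [1, 2, 2],
    "curbing pollution": [6, 7, 5],
}
# True when the favourable descriptor was printed on the left-hand side.
favourable_on_left = {"quality products": True, "curbing pollution": False}


def oriented_score(position, fav_left):
    """Map a 1..7 position to +3..-3 so that +3 is always most favourable."""
    return 4 - position if fav_left else position - 4


# Mean oriented score per item: one point on the pictorial profile.
profile = {
    item: mean(oriented_score(p, favourable_on_left[item]) for p in positions)
    for item, positions in ratings.items()
}
print(profile)
```

If the response categories are treated as ordinal rather than interval, statistics.median can be used in place of mean, in line with the discussion above.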


Stapel scale: The Stapel scale is used to measure the direction and intensity 
of an attitude. At times, it may be difficult to use semantic differential scales 
because of the problem in creating bipolar adjectives. The Stapel scale 
overcomes this problem by using only single adjectives. This scale generally 
has 10 categories numbered from -5 to +5 without a neutral point and is 
usually presented in a vertical form. The job of the respondent is to indicate 
how accurately or inaccurately each term describes the object by selecting 
an appropriate numerical response category. The higher the positive number 
selected by the respondent, the more favourably the respondent describes 
the object. Suppose a restaurant is to be evaluated on quality of food and 
quality of service, then the Stapel scale would be presented as shown below: 




                 RESTAURANT
       +5                      +5
       +4                      +4
       +3                      +3
       +2*                     +2
       +1                      +1
 Quality of Food       Quality of Service
       -1                      -1
       -2                      -2
       -3                      -3
       -4                      -4
       -5                      -5*


In the above scale, the respondents are asked to evaluate how accurately 
each word or phrase describes the restaurant in question. They will choose a 
value of +5 if the word or phrase describes the restaurant very accurately and 
-5 if it does not describe the restaurant at all. Suppose a respondent 
has chosen his options as indicated by *. This shows that the respondent 
rates the quality of food mildly favourably (+2) and is of the opinion that 
the quality of service is extremely poor (-5). 


10.2.1 Types of Measurement Scales: Nominal, Ordinal, Interval 
and Ratio 


There are four types of measurement scales—nominal, ordinal, interval and 
ratio scales. We will discuss each one of them in detail. The choice of the 
measurement scale has implications for the statistical technique to be used 
for data analysis. 


Nominal scale: In the nominal scale, numbers 
are assigned for the purpose of identification of the objects. Any object 
which is assigned a higher number is in no way superior to the one which 
is assigned a lower number. In the nominal scale there is a strict one-to-one 
correspondence between the numbers and the objects. Each number is 
assigned to only one object and each object has only one number assigned 
to it. It may be noted that the objects are divided into mutually exclusive and 
collectively exhaustive categories. 
Examples of nominal scale: 
• What is your religion? 
(a) Hinduism 
(b) Sikhism 
(c) Christianity 
(d) Islam 
(e) Any other, (please specify) 
A Hindu may be assigned a number 1, a Sikh may be assigned a number 
2, a Christian may be assigned a number 3 and so on. Any religion 
which is assigned a higher number is in no way superior to the one 
which is assigned a lower number. The assignment of numbers is only 
for the purpose of identification. We also note that all respondents 
have been divided into mutually exclusive and collectively exhaustive 
categories. For example: 
• Are you married? 
(a) Yes 
(b) No 
If a person is married, he or she may be assigned a number 101 and 
an unmarried person may be assigned a number 102. 
• In which of the following departments do you work? 
(a) Marketing 
(b) HR 
(c) Information Technology 
(d) Operations 
(e) Finance and Accounting 
(f) Any other, (please specify) 
Here also, a person working for the marketing department may be assigned 
a number 1, the one working for HR may be assigned a number 2 and so on. 
Nominal scale measurements are used for identifying food habits 
(vegetarian or non-vegetarian), gender (male/female), caste, respondents, 
brands, attributes, stores, the players of a hockey team and so on. 


The assigned numbers cannot be added, subtracted, multiplied or 
divided. The only arithmetic operation that can be carried out is the count 
of each category. Therefore, a frequency distribution table can be prepared 
for the nominal scale variables and mode of the distribution can be worked 
out. One can also use chi-square test and compute contingency coefficient 
using nominal scale variables. 
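
A frequency distribution and its mode for a nominal variable can be obtained as in the short sketch below. The coded answers are hypothetical and follow the department coding suggested in the text (1 = Marketing, 2 = HR and so on); only counting is applied to the codes, never arithmetic.

```python
from collections import Counter

# Hypothetical coded answers to "In which department do you work?"
# 1 = Marketing, 2 = HR, 3 = Information Technology, 4 = Operations
answers = [1, 3, 3, 2, 4, 3, 1, 2, 3]

# Frequency distribution: how many respondents fall in each category.
frequency = Counter(answers)

# Mode: the category code that occurs most often, with its count.
mode_code, mode_count = frequency.most_common(1)[0]
print(dict(frequency), mode_code, mode_count)
```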




Ordinal scale: This is the next higher level of measurement than the nominal 
scale measurement. One of the limitations of the nominal scale measurements 
is that we cannot say whether the assigned number to an object is higher or 
lower than the one assigned to another object. The ordinal scale measurement 
takes care of this limitation. An ordinal scale measurement tells whether an 
object has more or less of a characteristic than some other object. However, it 
cannot answer how much more or how much less. An ordinal scale tells us the 
relative positions of the objects and not the difference between the magnitudes 
of the objects. Suppose Shashi scores the highest marks in marketing and is 
ranked no. 1; Mohan scores the second highest marks and is ranked no. 2; and 
Krishna scores third highest marks and is ranked no. 3. However, from this 
statement we cannot say whether the difference in the marks scored by Shashi 
and Mohan is the same as between Mohan and Krishna. The only statement 
which can be made under ordinal scale is that Shashi has scored higher than 
Mohan and Mohan has scored higher than Krishna. The difference between the 
ranks does not have any meaningful interpretation in the sense that it cannot 
tell the difference in absolute marks between the three candidates. Another 
example of the ordinal scale could be the CAT score given in percentile form. 
Suppose a candidate’s score is 95 percentile in the CAT exam. What it means 
is that 95 per cent of the candidates that appeared in the CAT examination 
have a score below this candidate, whereas only 5 per cent have scored more 
than him. How much lower or higher the actual scores are cannot be known from 
this statement. Examples of the ordinal scale include quality ranking, rankings 
of the teams in a tournament, ranking of preference for colours, soft drinks, 
socio-economic class and occupational status, to mention a few. Some of the 
examples of ordinal scales are listed below: 


• Rank the following attributes while choosing a restaurant for dinner. 
The most important attribute may be ranked 1, the next most important 
may be assigned a rank of 2 and so on. 

  Attribute            Rank
  Food quality         ____
  Prices               ____
  Menu variety         ____
  Ambience             ____


• Rank the following by placing a 1 beside the attribute you think is the 
most important, a 2 beside the attribute you think is the second most 
important and so on while purchasing a two-wheeler. 

  Attribute            Rank
  After-sale service   ____
  Prices               ____
  Re-sale value        ____
  Fuel efficiency      ____
  Aesthetic appeal     ____


In the ordinal scale, the assigned ranks cannot be added, multiplied, 
subtracted or divided. One can compute median, percentiles and quartiles 
of the distribution. The other major statistical analyses which can be carried 
out are the rank order correlation coefficient and the sign test. As the ordinal scale 
measurement is higher than the nominal scale measurement, all the statistical 
techniques which are applicable in the case of nominal scale measurement 
can also be used for the ordinal scale measurement. However, the reverse 
is not true. This is because ordinal scale data can be converted into nominal 
scale data but not the other way round. 
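
The rank order correlation coefficient mentioned above can be illustrated with Spearman's formula, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), which holds when there are no tied ranks. The sketch below is illustrative only: the function name spearman_rho and the two sets of ranks are hypothetical.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank correlation for two rankings without tied ranks."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rankings of equal length (at least 2 items)")
    # d is the difference between the two ranks given to the same attribute.
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks given by two buyers to the five two-wheeler attributes
# (after-sale service, prices, re-sale value, fuel efficiency, aesthetic appeal).
buyer_1 = [1, 2, 3, 4, 5]
buyer_2 = [2, 1, 3, 5, 4]
rho = spearman_rho(buyer_1, buyer_2)
print(round(rho, 2))
```

The coefficient uses only the order of the attributes, which is all that an ordinal scale provides.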


Interval scale: The interval scale measurement is the next higher level of 
measurement. It takes care of the limitation of the ordinal scale measurement 
where the difference between the score on the ordinal scale does not have any 
meaningful interpretation. In the interval scale the difference of the score on 
the scale has meaningful interpretation. It is assumed that the respondent is 
able to answer the questions on a continuum scale. The mathematical form 
of the data on the interval scale may be written as 


$$Y = a + bX, \quad a \neq 0$$


The interval scale data has an arbitrary origin (non-zero origin). The 
most common example of the interval scale data is the relationship between 
Celsius and Fahrenheit temperature. It is known that: 


$$C = \frac{5}{9}(F - 32)$$


Therefore, 

$$C = -\frac{160}{9} + \frac{5}{9}F$$


This is of the form $Y = a + bX$, where $a = -\frac{160}{9}$ and $b = \frac{5}{9}$; and hence it 
represents the interval scale measurement. In the interval scale, the difference 
in score has a meaningful interpretation while the ratio of the score on this 
scale does not have a meaningful interpretation. This can be seen from the 
following interval scale question: 


• How likely are you to buy a new designer carpet in the next six months? 


            Very unlikely   Unlikely   Neutral   Likely   Very likely
  Scale A         1             2         3         4          5
  Scale B         0             1         2         3          4
  Scale C        -2            -1         0         1          2


Suppose a respondent ticks the response category ‘likely’ and another 
respondent ticks the category ‘unlikely’. If we use any of the scales A, B or 
C, we note that the difference between the scores in each case is 2. However, 
when the ratio of the scores is taken, it is 2, 3 and -1 for the scales A, B and 
C respectively. Therefore, the ratio of the scores on the scale does not have 
a meaningful interpretation. The following are some examples of interval 
scale data. 
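
The arithmetic behind the carpet example can be checked with a few lines of Python. The dictionary layout is an assumption made for illustration; the codes for 'unlikely' and 'likely' are taken from scales A, B and C as given in the text.

```python
# Codes assigned to 'unlikely' and 'likely' under the three scales.
scales = {
    "A": {"unlikely": 2, "likely": 4},
    "B": {"unlikely": 1, "likely": 3},
    "C": {"unlikely": -1, "likely": 1},
}

# The difference is the same whichever scale is used ...
differences = {name: s["likely"] - s["unlikely"] for name, s in scales.items()}

# ... but the ratio changes with the arbitrary choice of origin.
ratios = {name: s["likely"] / s["unlikely"] for name, s in scales.items()}

print(differences)
print(ratios)
```

Because the ratio depends on where the arbitrary origin is placed, only differences carry meaning on an interval scale.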


• How important is price to you while buying a car? 

  Least important   Unimportant   Neutral   Important   Most important
         1               2           3          4             5

• How do you rate the work environment of your organization? 

  Very good   Good   Neither good nor bad   Bad   Very bad
      5         4             3               2       1

• The counter-clerks at ICICI Bank (Vasant Kunj Branch) are very friendly. 

  Strongly disagree   Disagree   Neither agree nor disagree   Agree   Strongly agree
          1               2                  3                   4            5

• Rate the life of the battery of your inverter. 

  Low   1   2   3   4   5   High

• Indicate the degree of satisfaction with the overall performance of 
Wagon R. 

  Very dissatisfied   1   2   3   4   5   Very satisfied


  Extremely   Definitely   Somewhat    Somewhat      Definitely    Extremely
  expensive   expensive    expensive   inexpensive   inexpensive   inexpensive
      1           2            3            4             5             6

• How likely are you to buy a new car within the next six months? 

  Definitely   Probably   Neutral   Probably will   Definitely will
  will buy     will buy             not buy         not buy
      1           2          3            4               5


The numbers on this scale can be added, subtracted, multiplied or 
divided. One can compute arithmetic mean, standard deviation, correlation 
coefficient and conduct a t-test, Z-test, regression analysis and factor analysis. 
As the interval scale data can be converted into the ordinal and the nominal 
scale data, all the techniques applicable for the ordinal and the 
nominal scale data can also be used for interval scale data. 


Ratio scale: This is the highest level of measurement and takes care of 
the limitations of the interval scale measurement, where the ratio of the 
measurements on the scale does not have a meaningful interpretation. The 
ratio scale measurement can be converted into interval, ordinal and nominal 
scale. But the other way round is not possible. The mathematical form of 
the ratio scale data is given by Y = bX. In this case, there is a natural zero 
(origin), whereas in the interval scale we had an arbitrary zero. Examples 
of the ratio scale data are weight, distance travelled, income and sales of a 
company, to mention a few. 


10.3 METHODS OF CONSTRUCTION OF 
QUESTIONNAIRES OR SCHEDULES 


We have already discussed the method of construction of questionnaires 
and schedules in Unit 9. To briefly recapitulate what we have discussed, 
the steps involved in designing a questionnaire encompass the following: 
(1) converting the research objectives into the information needed, (2) deciding 
the method of administering the questionnaire, (3) deciding the content of the 
questions, (4) motivating the respondent to answer, (5) determining the types of 
questions, (6) setting the question design criteria, (7) determining the 
questionnaire structure, (8) deciding the physical presentation of the 
questionnaire, (9) pilot testing the questionnaire and (10) standardizing the 
questionnaire. 


Check Your Progress 


1. How can scaling techniques used in research be classified? 


2. List the four types of measurement scales. 


3. What are some examples of ratio scale data? 


10.4 PRE-TESTING OF DATA COLLECTION TOOLS 


Pre-test is a trial test of a specific aspect of the study, such as the common 
methods of data collection or common data collection tools: schedule 
(used as tool for interviewing), questionnaire or measurement scale. It 
is the administration of the data collection instrument with a small set 
of respondents from the population for the full-scale survey. If problems 
happen in the pre-test, the researcher is likely to face similar problems in 
full-scale administration. Pre-testing aims at identifying problems with the 
data collection instrument and finding possible solutions. Pre-testing needs to 
be carried out in circumstances that are as akin as possible to actual data 
collection and with respondents who are as similar as possible to those that 
will be sampled. 


Survey sponsors have a major role to play in developing the data 
collection instruments being proposed, including any testing being carried 
out. Much of the accuracy and interpretability of the survey results depends 
on pre-testing, which should never be ignored. 


Need for pre-testing 


An instrument of data collection is designed in accordance with the data 
requirements of the study. However, any scrutiny by the designer and 
other researchers cannot make the instrument perfect. It needs to be tested 
empirically. As pointed out by Goode and Hatt: ‘No amount of thinking, no matter 
how logical the mind or brilliant the insight, is likely to take the place of 
careful empirical checking.’ Thus, pre-testing of a draft instrument is essential. 


Purpose of pre-testing 


Pre-testing aims at: 
• Testing whether the instrument would draw out responses needed to 
achieve the research objectives 

• Developing a suitable procedure to administer the instrument with 
reference to field conditions 

• Testing whether wording of questions is unambiguous and suited to 
the understanding of the respondents 

• Testing whether the content of the instrument is applicable and sufficient 

• Testing the other qualitative aspects of the instrument, such as question 
structure and question sequence 


Pre-Testing of Questionnaire 


Pilot testing refers to testing and administering the designed instrument 
on a small group of people from the population under study. This is to 
essentially cover any errors that might have still remained even after 
... all the experiences of the conduction, including the time taken to 
administer it. If the respondent had a problem understanding a question 
or response category, the investigator should record verbatim the 
instruction he/she gave to clarify the point, as this would then need to 
be incorporated in the final version of the questionnaire. 


In case a question got no answers, then it might be essential to rephrase 
the entire question. 


Even when the mode of administration is mail or Internet or self- 
administered tests, the pilot tests should always be done in a face-to- 
face interaction. Here, the researcher is able to observe and record 
responses, both verbal and non-verbal. 


Sometimes, the researcher might also get the questionnaire vetted by 
academic or industry experts for their inputs. 


Once the essential changes have been made, the researcher might carry 
out one short trial and then go ahead with the actual administration. 


As far as possible, the pilot should be a small scale replica of the actual 
survey that would be subsequently conducted. 


It is advisable to use multiple investigators for the pilot study. 


The group of investigators should be a mix of experienced and seasoned 
field investigators and inexperienced investigators as well. 


The inexperienced ones would be able to reveal the problems 
encountered in administering the measure, while the experienced field 
workers would be able to report respondent difficulties in answering 
the questions. 


The respondent’s experience of the pilot test can be recorded in two 
ways. One is protocol analysis where he is asked to speak out the 
reasoning in responding to the questions. This is recorded, as it helps 
to understand the underlying factors or mental processing involved in 
giving answers. 


The other method is called debriefing, where after the questionnaire 
has been completed, the person is asked to summarize his experience 
in terms of any problems experienced in answering or whether there 
was any confusion or fatigue while answering the questionnaire. 


The researcher must then edit the questionnaire as required and carry 
out any further pilot tests. Once this is over, he enters the pilot data to 
explore and see whether the information that is being collected through 
the questionnaire would adequately furnish the information needs for 
which the instrument was designed. 


Methods of pre-testing questionnaires 


The various methods or techniques of pre-testing questionnaires are listed 
and discussed as follows: 


• Respondent focus groups: Focus groups (referred to as a form of 
in-depth group interviewing) are carried out early in the questionnaire 
development cycle to evaluate the question-answering process. Such 
groups may collect information relating to a topic before the beginning 
of questionnaire construction. They help in the identification of 
differences in language, terminology, or interpretation of questions 
and response options. They are especially useful in pre-testing 
self-administered questionnaires in order to learn about the appearance 
and formatting of the questionnaire. One of the major advantages of 
focus groups is that they provide the opportunity to monitor a great deal 
of interaction on a topic in a limited time span. 


• Behaviour coding: Behaviour coding refers to a systematic coding of 
the interaction between interviewers and respondents from live or taped 
interviews. It focuses on specific aspects of how the interviewer 
asked the question and how the respondent reacted. When used for 
pre-testing a questionnaire, the coding highlights interviewer or respondent 
behaviours indicative of a problem with the question, the response 
categories, or the respondent’s ability to form a satisfactory response. 


• Cognitive laboratory interviews: Cognitive laboratory interviews 
comprise one-on-one interviews using a structured questionnaire in 
which respondents describe their thoughts while giving answers to the 
survey questions. They provide a vital means of finding out directly 
from respondents what their problems are with the questionnaire. In 
addition, small numbers of interviews may give valuable information 
about major problems, such as repetitions of questions and ambiguous 
concepts. As sample sizes are not large in cognitive laboratory 
interviews, repeated pre-testing of an instrument is common. 


• Respondent and interviewer debriefings: Respondent debriefings 
involve the incorporation of structured follow-up questions at the end of 
a field test interview to gather quantitative and qualitative information 
about respondents’ interpretations of survey questions. For the purpose 
of pre-testing, their prime objective is to find out whether the survey 
concepts and questions are comprehended by respondents in the same 
way that the survey sponsors intended. 


Interviewer debriefings have conventionally been the primary method 
for evaluating field tests. In this method, the interviewers conducting the 
survey field tests are asked to draw on their direct contact with respondents so 
that the questionnaire designer’s understanding of questionnaire problems 
is enriched. 


• Analysis of item non-response rates: Analysing item non-response 
rates from the data gathered during a field test provides useful 
information about how well the questionnaire works. This is carried 
out by looking at how often items are missing (i.e., item non-response 
rates). 


Split-panel tests: Split-panel tests are the controlled experimental 
testing among questionnaire variants or interviewing modes for 
determining which is ‘better’ or for measuring differences between 
them. In order to pre-test multiple versions of a questionnaire, research 
requires a previously determined standard by which to judge the 
differences. Split-panel tests are also useful in standardizing the effect 
of changing questions, which is particularly significant in the redesign 
and testing of surveys where the comparability of the data gathered 
over time is a problem. 


• Analysis of response distributions: The analysis of response 
distributions for an item is useful in determining whether different 
question wordings or question sequences result in different response 
patterns. Such analysis is most useful when the researcher has to pre-test 
more than one version of a questionnaire or a single questionnaire in 
which some known distribution of characteristics exists for comparative 
purposes. 


Pre-Testing of Interview Schedule 


The pre-testing of interview schedule involves contact with respondents 
drawn from the same population as for the actual survey. Pre-testing includes 
the testing of question content, wording, sequence, form and layout, difficulty, 
instructions and acceptance. On the completion of pre-testing, all necessary 
changes are made to fix the identified problems. As in any high-quality 
research plan, a researcher needs to pre-test the interview protocol, or list 
of interview questions, before collecting data for the main study. In other 
words, first of all, the researcher conducts a pilot study of his list of interview 
questions with a group of persons who are demographically similar to his 
ultimate sample profile. This helps in the determination of the most logical 
and smooth-flowing order of the questions. Pre-testing also identifies wording 
issues that need to be addressed for the sake of clarity, which will enhance 
the integrity of the researcher’s data. Last but not the least, a pre-test sheds 
important light on the amount of time to be taken to conduct the interview, 
which is one of the first questions the researcher will be asked by potential 
participants. 


10.5 VALIDITY AND RELIABILITY METHODS 


There are three criteria for good collection of data: reliability, validity and 
sensitivity. 


1. Reliability 


Reliability is concerned with consistency, accuracy and predictability of the 
scale. It refers to the extent to which a measurement process is free from 
random errors. The reliability of a scale can be measured using the following 
methods: 


Test—retest reliability: In this method, repeated measurements of the 
same person or group using the same scale under similar conditions are taken. 
A very high correlation between the two scores indicates that the scale is 
reliable. However, the following issues should be kept in mind before arriving 
at such a conclusion. 


• The appropriate time difference between the two observations 
requires attention. If the time difference between two consecutive 
observations is very small (say two or three weeks), it is very likely 
that the respondents would remember the previous answer and give 
the same answer when the instrument is administered the second time. 
This will make the instrument appear reliable, which may not actually 
be the case. However, if the difference between the two observations 
is very large (say more than a year), it is quite likely that the 
respondent’s answers to the various questions of the instrument 
might have actually undergone a change, resulting in poor reliability of 
the scale. Therefore, the researcher has to be very careful in deciding 
upon the time difference between the two observations. Generally, it 
is thought that a time difference of about five to six months is an ideal 
period. 


• Another problem in this test is that the first measurement may change 
the response of the subject to the second measurement. 


• The situational factors working on two different time periods may not 
be the same, which may result in different measurement in the two 
periods. 


• The second reading on the same instrument from the same subject may 
produce boredom, anger or an attempt to remember the answers given in 
an initial measurement. 


• A favourable response with a brand during the period between the two 
tests might cause a shift in the individual rating by the subject. 
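
The test-retest check described above comes down to correlating two sets of scores from the same respondents. The following is a minimal sketch, assuming the Pearson correlation coefficient is the measure used (the text says only 'correlation'); the scores and the name `pearson_r` are hypothetical and serve only as illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scale scores for six respondents, measured on two occasions.
first_round = [12, 15, 9, 20, 17, 11]
second_round = [13, 14, 10, 19, 18, 10]

r = pearson_r(first_round, second_round)
print(f"test-retest correlation: {r:.3f}")
```

A value close to 1 would be read as evidence of reliability, subject to the cautions listed above about the time gap and memory effects.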


Split-half reliability method: This method is used in the case of multiple 
item scales. Here the items are randomly divided into two parts 
and a correlation coefficient between the two is obtained. A high correlation 
indicates that the internal consistency of the construct leads to greater 
reliability. Another measure which is used to test the internal consistency 
of a multiple item scale is the coefficient alpha (α), commonly known as 
Cronbach’s alpha. Cronbach’s alpha computes the average of all possible 
split-half reliabilities for a multiple item scale. This coefficient demonstrates 
whether the average score of all split-half reliabilities converges to a certain 
point or not. 


The coefficient alpha does not address validity. However, many 
researchers use this as a sole indicator of validity. The alpha coefficient 
can take values between 0 and 1. The following interpretation of the values 
of alpha is suggested: 


α = 0              There is no consistency between the various items 
                   of a multiple item scale 
α = 1              There is complete consistency between the various 
                   items of a multiple item scale 
0.80 < α < 0.95    There is very good reliability between the various 
                   items of a multiple item scale 
0.70 < α < 0.80    There is good reliability between the various items 
                   of a multiple item scale 
0.60 < α < 0.70    There is fair reliability between the various items 
                   of a multiple item scale 
α < 0.60           There is poor reliability between the various items 
                   of a multiple item scale 
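
As an illustration, coefficient alpha can be computed from a respondents-by-items table of scores. The sketch below assumes the standard variance-based formula, α = k/(k − 1) × (1 − sum of item variances / variance of total scores), which is not spelled out in the text; the data and function names are hypothetical. The text leaves exact boundary values such as 0.70 unassigned, so treating the lower bound of each band as inclusive is also an assumption.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of the same length."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    # Variance of each item across respondents, and variance of the total scores.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def interpret_alpha(alpha):
    """Map alpha to the bands listed above (lower bounds inclusive by assumption)."""
    if alpha < 0.60:
        return "poor"
    if alpha < 0.70:
        return "fair"
    if alpha < 0.80:
        return "good"
    if alpha < 0.95:
        return "very good"
    return "above the bands listed"

# Hypothetical responses of five respondents to a four-item scale (1-5 points).
scores = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f} ({interpret_alpha(alpha)})")
```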


2. Validity 


The validity of a scale refers to the question whether we are measuring what 
we want to measure. Validity of the scale refers to the extent to which the 
measurement process is free from both systematic and random errors. The 
validity of a scale is a more serious issue than reliability. There are different 
ways to measure validity: 


Content validity: This is also called face validity. It involves subjective 
judgement by an expert for assessing the appropriateness of the construct. 
For example, to measure the perception of a customer towards Kingfisher 
Airlines, a multiple item scale is developed. A set of 15 items is proposed. 
These items when combined in an index measure the perception of Kingfisher 
Airlines. In order to judge the content validity of these 15 items, a set of 
experts may be requested to examine the representativeness of the 15 items. 
The items covered may be lacking in the content validity if we have omitted 
behaviour of the crew, food quality, and food quantity, etc., from the list. In 
fact, conducting exploratory research to exhaust the list of items measuring 
perception of the airline would be of immense help in such a case. 


Concurrent validity: It is used to measure the validity of the new measuring 
techniques by correlating them with the established techniques. It involves 
computing the correlation coefficient of two measures of the same phenomenon 
(for example, perception of an airline and image of a company) which are 
administered at the same time. We prepare a 15 item scale to measure the 
perception of Kingfisher Airlines, which is assumed to be a valid one. Suppose 
a researcher proposes an alternative and shorter technique. The concurrent 
validity of the new technique would be established if there is a high correlation 
between the two techniques when administered at the same time under similar 
or identical conditions. 


Predictive validity: This involves the ability of a measured phenomenon at 
one point of time to predict another phenomenon at a future point of time. 
If the correlation coefficient between the two is high, the initial measure is 
said to have a high predictive ability. As an example, consider the use of 
the Common Admission Test (CAT) to shortlist candidates for admission to 
the MBA programme in a business school. The CAT scores are supposed 
to predict the candidate’s aptitude for studies towards business education. 


3. Sensitivity 


The sensitivity of a scale is an important measurement concept, particularly 
when changes in attitudes are under investigation. Sensitivity refers to an 
instrument’s ability to accurately measure the variability in a concept. A 
dichotomous response category such as agree or disagree does not allow the 
recording of any attitude changes. A more sensitive measure with numerous 
categories on the scale may be required. For example, adding strongly agree, 
agree, neither agree nor disagree, disagree and strongly disagree categories 
will increase the sensitivity of the scale. 


The sensitivity of scale based on a single question or a single item can 
be increased by adding questions or items. In other words, because composite 
measures allow for a greater range of possible scores, they are more sensitive 
than a single-item scale. Therefore, the sensitivity of the scale is generally 
increased by adding more response points or by adding scale items. 
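
The effect of extra response points and extra items on the range of possible scores can be checked with simple arithmetic. The sketch below assumes each item is scored 1, 2, ..., up to the number of response points and that item scores are summed; both are illustrative assumptions, and `distinct_scores` is a hypothetical helper.

```python
def distinct_scores(n_items, n_points):
    """Number of distinct total scores when n_items items, each scored
    from 1 to n_points, are summed into a composite measure."""
    lowest = n_items * 1
    highest = n_items * n_points
    return highest - lowest + 1

print(distinct_scores(1, 2))  # one agree/disagree item
print(distinct_scores(1, 5))  # one five-point item
print(distinct_scores(4, 5))  # composite of four five-point items
```

Under these assumptions the single dichotomous item distinguishes two score levels, the single five-point item five, and the four-item composite seventeen.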


Check Your Progress 
4. What is a pre-test? 


5. How can the sensitivity of scale based on a single question or item 
be increased? 


ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. The scaling techniques used in research can be classified into 
comparative and non-comparative scales. 


2. There are four types of measurement scales: nominal, ordinal, interval 
and ratio scales. 


3. Some examples of the ratio scale data are weight, distance travelled, 
income and sales of a company. 


4. Pre-test is a trial test of a specific aspect of the study, such as the 
common methods of data collection or common data collection tools, 
namely, schedule (used as tool for interviewing), questionnaire or 
measurement scale. 


5. The sensitivity of scale based on a single question or a single item can 
be increased by adding questions or items. 


SUMMARY 


In comparative scales it is assumed that respondents make use of a 
standard frame of reference before answering the question. 


Comparative scale data is generally interpreted in relative terms. The 
comparative scale includes paired comparison, rank order, constant 
sum scale and Q-sort technique, to mention a few. 


In the non-comparative scales, the respondents do not make use of any 
frame of reference before answering the questions. The resulting data 
is generally assumed to be interval or ratio scale. 


The non-comparative scales are divided into two categories, namely, 
the graphic rating scales and the itemized rating scales. 


In the graphic rating scale the respondent is asked to tick his preference 
on a graph. 


In the itemized rating scale, the respondents are provided with a scale 
that has a number of brief descriptions associated with each of the 
response categories. 


If an unbalanced scale is used, the nature and degree of the unbalance 
in the scale should be taken into account during the data analysis. 


The Stapel scale is used to measure the direction and intensity of an 
attitude. At times, it may be difficult to use semantic differential scales 
because of the problem in creating bipolar adjectives. 


There are four types of measurement scales—nominal, ordinal, interval 
and ratio scales. We will discuss each one of them in detail. 


• The choice of the measurement scale has implications for the statistical 
technique to be used for data analysis. 


• Pre-test is a trial test of a specific aspect of the study, such as the 
common methods of data collection or common data collection tools, 
namely, schedule (used as tool for interviewing), questionnaire or 
measurement scale. 


• Pre-test is the administration of the data collection instrument with a 
small set of respondents from the population for the full scale survey. 


The pre-testing of interview schedule involves contact with respondents 
drawn from the same population as for the actual survey. 


e There are three criteria for good collection of data: reliability, validity 
and sensitivity. 


e Reliability is concerned with consistency, accuracy and predictability 
of the scale. It refers to the extent to which a measurement process is 
free from random errors. 


e The sensitivity of a scale is an important measurement concept, 
particularly when changes in attitudes are under investigation. 


10.8 KEY WORDS 


e Focus Groups: It refers to a group of people assembled to participate 
in a discussion about a product before it is launched, or to provide 
feedback on a political campaign, television series, etc. 


e Sensitivity: It refers to an instrument’s ability to accurately measure 
the variability in a concept. 


e Behaviour Coding: It refers to a systematic coding of the interaction 
between interviewers and respondents from live or taped interviews. 


e Likert Scale: It is a scale used to represent people’s attitudes to a topic. 


e Split-Panel Tests: They are the controlled experimental testing among 
questionnaire variants or interviewing modes for determining which 
is ‘better’ or for measuring differences between them. 


10.9 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions 


1. What is graphic rating scale? 
2. What should be kept in mind while designing itemized rating scale? 


3. What is the Stapel scale used to measure? 
5. Discuss validity. 


Long-Answer Questions 


1. What are comparative rating scales? Discuss their various types. 


2. Describe the various non-comparative scales. 
3. Examine the various types of measurement scales. 
4. What is pre-test? Discuss its purpose. 


5. Describe the methods of pre-testing a questionnaire. 


UNIT 11 PROCESSING AND 
ANALYSIS OF DATA 


11.0 INTRODUCTION 


In the previous unit, you learnt about scaling techniques and pre-testing of data 
collection tools. In this unit, we will begin our discussion on the processing 
and analysis of data. The process of inspecting, cleaning, transforming and 
modelling data with the specific purpose of highlighting useful information, 
suggesting conclusions and supporting decision making is termed as analysis 
of data. There are multiple facets and approaches to data analysis. The data 
that is acquired must be identified as a matter of utmost importance. This is 
followed by the processing and analysis of the same in order to infer proper and 
accurate results. This unit focuses on the meaning, importance and the process 
of data analysis. 


11.1 OBJECTIVES 


After going through this unit, you will be able to: 
• Discuss the meaning and importance of data analysis 
• Explain the process of data analysis 
• Examine the importance and significance of coding 
• Classify data according to the various class intervals 


11.2 MEANING, IMPORTANCE AND PROCESS 
OF DATA ANALYSIS: EDITING, CODING, 
TABULATION AND DIAGRAMS 


Research does not merely consist of data that is collected. Research is 
incomplete without proper analysis of the collected data. Processing of data 
involves analysis and manipulation of the collected data by performing 
various functions. The data has to be processed in accordance with the outline 
laid down at the time of developing the research plan. Processing of data 
is essential for ensuring that all relevant data has been collected to perform 
comparisons and analyses. The functions that can be performed on data are 
as follows: 


• Editing 
• Coding 
• Tabulation 
• Classification 


Usually, experts are of the opinion that the exercise of processing and 
analysing of data is inter-related. Therefore, the two should be thought of as 
one and the same thing. It is argued that analysis of data generally involves a 
number of closely-related operations, which are carried out with the objective 
of summarizing the collected data and organizing it in such a way that they 
are able to answer the research questions associated with it. 


However, in technical terms, processing of data involves data 
representation in a way that it is open to analysis. Similarly, analysis of data 
is defined as the computation of certain measures along with searching for 
the patterns of relationship that may exist among data groups. 


Editing of data 


Editing of data involves the testing of data collection instruments in order to 
ensure maximum accuracy. This includes checking the legibility, consistency 
and completeness of the data. The editing process aims at avoiding 
equivocation and ambiguity. The collected raw data is also examined to detect 
errors and omissions, if any. A careful scrutiny is performed on the completed 
questionnaires and schedules to assure that the data has the following features: 


• Accuracy 
• Consistency 
• Unity 
• Uniformity 
• Effective arrangement 


The stages at which editing should be performed can be classified as 
follows: 


• Field editing: This involves a review, by the investigator, of the 
reporting forms that are written in an abbreviated or illegible form by the 
informant at the time of recording the respondent’s responses. Such type 
of editing must be done immediately after the interview. If performed 
after some time, such editing becomes complicated for the researcher, 
as it is difficult to decipher any particular individual’s writing style. 
The investigator needs to be careful while field editing and must refrain 
from correcting errors or omissions by guesswork. 


Central editing: This kind of editing involves a thorough editing of 
the entire data by a single editor or a team of editors. It takes place 
when all the schedules created according to the research plan have 
been completed and returned to the researcher. Editors correct the 
errors such as data recorded in the wrong place or the data recorded 
in months when it should be recorded in weeks. They can provide an 
appropriate answer to incorrect or missing replies by reviewing the 
other information in the schedule. At times, the respondent can be 
contacted for clarification. In some cases, if the answer is inappropriate 
or incomplete and an accurate answer cannot be determined on any 
basis, then the editor should delete or remove that answer from the 
collected data. He/She can put a note as ‘no answer’ in this case. The 
answers that can be easily deciphered as wrong should be dropped 
from the final results. 


Besides using the above-stated methods according to the data source, 
the researcher should also keep in mind the following points while editing: 


• Familiarity with the instructions given to interviewers and coders 
• Know-how of editing instructions 
• Single line striking for deleting an original entry 
• Standardized and distinctive editing of data 
• Initialling of all answers that are changed 


Coding of data 


The coding of data can be defined as representing the data symbolically using 
some predefined rules. Once data is coded and summarized, the researcher 
can analyse it and relationships can be found among its various categories. 


Checklist for coding 


This enables the researcher to classify the responses of the individuals 
according to a limited number of categories or classes. Such classes should 
possess the following important characteristics: 


e Classes should be appropriate and in accordance to the research problem 
under consideration. 


e They must include a class for every data element. 


e There should be a mutual exclusivity, which means that a specific 
answer can be placed in one and only one cell of a given category set. 


e The classes should be one-dimensional. This means that every class is 
defined in terms of only one concept. 
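
The checklist above can be illustrated with a small coding scheme. This is a minimal sketch under stated assumptions: the question (employment status), the code numbers and the names `CODEBOOK`, `OTHER`, `NO_ANSWER` and `code_response` are all hypothetical. The residual 'other' and 'no answer' codes are one way of making sure there is a class for every data element, while the single lookup gives each answer exactly one code.

```python
# Hypothetical codebook for a question on employment status.
CODEBOOK = {"employed": 1, "unemployed": 2, "student": 3, "retired": 4}
OTHER = 8       # residual class, so that every answer has a class
NO_ANSWER = 9   # blank or missing replies

def code_response(answer):
    """Return exactly one numeric code for a raw answer."""
    if answer is None or not answer.strip():
        return NO_ANSWER
    # Normalise spelling differences before looking up the code.
    return CODEBOOK.get(answer.strip().lower(), OTHER)

responses = ["Employed", "student", "", "Homemaker", "retired ", None]
codes = [code_response(r) for r in responses]
print(codes)
```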


Significance of coding 


Coding of data is necessary for its efficient analysis. Coding facilitates 
reduction of data from a variety to a small number of classes. Thus, only 
that information which is important and critical for analysis is retained in 
the research. Coding decisions are usually taken at the designing stage of the 
questionnaire. This makes it possible to pre-code the questionnaire choices, 
which in turn, is helpful for computer tabulation. 


However, in case of hand coding, some standard method should be 
used. One such method is to code in the margin with a coloured pencil. 
The other method is to transcribe data from the questionnaire to a coding 
sheet. Whatever method is adopted, you should ensure that coding errors are 
altogether eliminated or reduced to a minimum level. 


Classification of data 


Research studies involve extensive collection of raw data and usage of 
the data to implement the research plan. To make the research plan easier, 
the data needs to be classified in different groups for understanding the 
relationship among the different phases of the research plan. Classification of 
data involves arrangement of data in groups or classes on the basis of some 
common characteristics. The methods of classification can be divided under 
the following two headings: 


• Classification according to attributes 
• Classification according to class intervals 


Figure 11.1 shows the categories of data. 


[Figure: Data classification. According to attributes: descriptive 
classification, simple classification, manifold classification. According 
to class intervals: exclusive class-intervals, inclusive class-intervals.] 


Fig. 11.1 Data Classification 


Classification of data according to attributes 


Data is classified on the basis of similar features as follows: 


e Descriptive classification: This classification is performed according 
to the qualitative features and attributes which cannot be measured 
quantitatively. These features can be either present or absent in 
an individual or an element. The features related to descriptive 
classification of attributes can be literacy, sex, honesty, solidarity, etc. 


e Simple classification: In this classification the elements of data are 
categorized on the basis of those that possess the concerned attribute 
and those that do not. 


• Manifold classification: In this classification two or more attributes are 
considered simultaneously and the data is categorized into a number of 
classes on the basis of those attributes. The total number of classes of 
final order is given by 2^n, where n = number of attributes considered. 
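
The count of final-order classes in manifold classification can be verified by listing every present/absent combination of the attributes. The sketch below uses three hypothetical attributes; `manifold_classes` is an illustrative helper, not a standard function.

```python
from itertools import product

def manifold_classes(attributes):
    """List every combination of presence (True) or absence (False)
    of the given attributes; each combination is one final-order class."""
    return [dict(zip(attributes, combo))
            for combo in product([True, False], repeat=len(attributes))]

# Hypothetical attributes considered simultaneously.
attributes = ["literate", "employed", "urban"]
classes = manifold_classes(attributes)

print(len(classes))  # equals 2 ** len(attributes)
for c in classes:
    print(c)
```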


Classification of data according to class intervals 


Classifying data according to the class intervals is a quantitative phenomenon. 
Class intervals help categorize the data with similar numerical characteristics, 
such as income, production, age, weight, etc. Data can be measured through 
some statistical tools like mean, mode, median, etc. The different categories 
of data according to class intervals are as follows: 


e Statistics of variables: This term refers to the measurable attributes, as 
these typically vary over time or between individuals. The variables can 
be discrete, i.e., taking values from a countable or finite set, continuous, 
i.e., having a continuous distribution function, or neither. This concept 
of variable is widely utilized in the social, natural and medical sciences. 


Class intervals: They refer to a range of values of a variable. This 
interval is used to break up the scale of the variable in order to tabulate 
the frequency distribution of a sample. A suitable example of such 
data classification can be given by means of categorizing the birth 
rate of a country. In this case, babies aged zero to one year will form 
a group; those aged two to five years will form another group, and so 
on. The entire data is thus categorized into several numbers of groups 
or classes or in other words, class intervals. Each class interval has an 
upper limit as well as a lower limit, which is defined as ‘the class limit.’ 
The difference between two class limits is known as class magnitude. 
Classes can have equal or unequal class magnitudes. 


The number of elements, which come under a given class, is called the 
frequency of the given class interval. All class intervals, with their respective 
frequencies, are taken together and described in a tabular form called the 
frequency distribution. 
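
A frequency distribution of this kind can be built by counting how many elements fall into each class interval. The following is a minimal sketch, assuming equal class magnitudes and intervals that include the lower limit but not the upper one; the ages and the function name are hypothetical.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count values into n_classes equal intervals starting at `lower`.
    Each interval includes its lower limit and excludes its upper limit."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - lower) // width)
        if not 0 <= idx < n_classes:
            raise ValueError(f"{v} falls outside the class intervals")
        counts[idx] += 1
    # Each row: (lower limit, upper limit, class frequency).
    return [(lower + i * width, lower + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical ages of ten respondents.
ages = [12, 25, 31, 18, 44, 30, 27, 39, 10, 22]
table = frequency_distribution(ages, lower=10, width=10, n_classes=4)
for lo, hi, freq in table:
    print(f"{lo}-{hi}: {freq}")
```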
Problems related to classification of data 


The problems related to classification of data on the basis of class intervals 
are divided into the following three categories: 


(i) Number of classes and their magnitude: There are differences 
regarding the number of classes into which data can be classified. 
As such, there are no pre-defined rules for the classification of data. 
It all depends upon the skill and experience of the researcher. The 
researcher should display the data in such a way that it is clear 
and meaningful to the analyst. 


As regards the magnitude of classes, it is usually held that class intervals 
should be of equal magnitude, but in some cases unequal magnitudes 
may result in a better classification. It is the researcher’s objective 
and judgement that plays a significant role in this regard. In general, 
multiples of two, five and ten are preferred while determining class 
magnitudes. H.A. Sturges suggested the following formula to determine 
the size of class interval: 


i = R / (1 + 3.3 log N) 


where, 
i = size of class interval 


R= Range (difference between the values of the largest element 
and smallest element among the given elements) 


N = Number of items to be grouped 
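
As a worked illustration, Sturges' rule is commonly written as i = R / (1 + 3.3 log N), with the logarithm taken to base 10. The sketch below applies it to a hypothetical data set; the function name is illustrative.

```python
import math

def sturges_interval(values):
    """Suggested class-interval size: i = R / (1 + 3.3 * log10(N))."""
    r = max(values) - min(values)   # R: range of the data
    n = len(values)                 # N: number of items to be grouped
    return r / (1 + 3.3 * math.log10(n))

# Hypothetical data: 100 items with values 1 to 100, so R = 99 and N = 100.
data = list(range(1, 101))
i = sturges_interval(data)
print(round(i, 2))
```

Here log N = 2, so i = 99 / 7.6, or roughly 13. In keeping with the preference for multiples of two, five and ten, a researcher might then settle on a nearby convenient magnitude such as 10 or 15.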


Sometimes, data may contain one or two or very few elements with 
very high or very low values. In such cases, the researcher can use an 
open-ended interval in the overall frequency distribution. Such intervals 
can be expressed as ‘below two years’ or ‘twelve years and above’. Such 
intervals are not desirable, yet they cannot always be avoided. 


(ii) Choice of class limits: While choosing class limits, the researcher must 
determine the mid-point of a class interval. A mid-point is, generally, 
derived by taking the sum of the upper and lower limits of a class and 
then dividing it by two. The mid-point and the actual average of the 
elements of that class interval should remain as close to each other as 
possible. In accordance with this principle, the class limits should be 
located at multiples of two, five, ten, twenty, a hundred and other such 
figures. The class limits can generally be stated in any of the following forms: 


o Exclusive type class intervals: These intervals are usually stated 
as follows: 


• 10-20 
• 20-30 
• 30-40 
• 40-50 


These intervals should be read in the following way: 
• 10 and under 20 
• 20 and under 30 
• 30 and under 40 
• 40 and under 50 


In the exclusive type of class intervals, the elements whose values are 
equal to the upper limit of a class are grouped in the next higher class. 
For example, an item whose value is exactly thirty would be put in 
30-40-class interval and not in 20—30-class interval. In other words, 
an exclusive type of class interval is that in which the upper limit of 
a class interval is excluded and items with values less than the upper 
limit, but not less than the lower limit, are put in the given class interval. 


o Inclusive type class intervals: These intervals are normally stated 
as follows: 


e 11-20 
e 21-30 
e 31-40 
e 41-50 
This should be read as follows: 
e 11 and under 21 
e 21 and under 31 
e 31 and under 41 
e 41 and under 51 


In this method, the upper limit of a class interval is also included in 
the class interval concerned. Thus, an element whose value is twenty 
will be put in the 11-20 class interval. The stated upper limit of the class 
interval 11-20 is twenty but the real upper limit is 20.999999 and 
as such the 11-20 class interval really means eleven and under twenty-one. 
When the data to be classified is discrete, the inclusive type of 
classification should be applied. But when the data is continuous, 
the exclusive type of class intervals can be used. 
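
The difference between the two types shows up only at the class limits. The sketch below is an illustration under stated assumptions: classes of magnitude ten starting at 10 (exclusive type) or 11 (inclusive type), as in the examples above, and integer values for the inclusive case. The function names are hypothetical.

```python
def exclusive_class(value, lower=10, width=10):
    """Exclusive type: lower limit <= value < upper limit."""
    start = lower + ((value - lower) // width) * width
    return (start, start + width)

def inclusive_class(value, lower=11, width=10):
    """Inclusive type (discrete data): lower limit <= value <= upper limit."""
    start = lower + ((value - lower) // width) * width
    return (start, start + width - 1)

print(exclusive_class(30))  # 30 goes to the 30-40 class, not 20-30
print(exclusive_class(29))  # 29 stays in the 20-30 class
print(inclusive_class(20))  # 20 goes to the 11-20 class
print(inclusive_class(21))  # 21 goes to the 21-30 class
```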


(iii) Determining the frequency of each class: The frequency of each class 
can be determined using tally sheets or mechanical aids. In tally sheets, 
the class groups are written on a sheet of paper and for each item a 
stroke (a small vertical line) is marked against the class group in which 
it falls. The general practice is that after every four small vertical lines 
in a class group, the fifth line for the element falling in the same group 
is indicated as a diagonal line through the above said four lines. This 
enables the researcher to perform the counting of elements in each one 
of the class groups. Table 11.1 displays a hypothetical tally sheet. 


Table 11.1 A Tally Sheet 


Income groups (Rupees) | Tally mark | Number of families (Class frequency) 

In case of large inquiries and surveys, class frequencies can be 
determined by means of mechanical aids, i.e., with the help of machines. Such 
machines function, either manually or automatically and run on electricity. 
These machines can sort out cards at a speed of around 25,000 cards per 
hour. Although this method increases the speed, it is an expensive method. 


Tabulation of data 


In simple terms, tabulation means placing the results and data collected from 
research in a tabular form. 


Methods of tabulation 


Tabulation can be done either manually or mechanically using various 
electronic devices. Several factors like the size and type of study, cost 
considerations, time pressures and availability of tabulating machines decide 
the choice of tabulation. Relatively large data requires computer tabulation. 
Manual tabulation is preferred in case of small inquiries, when the number of 
questionnaires is small and they are of relatively short length. The different 
methods used in hand tabulation are as follows: 


• Direct tally method: This method involves simple codes, which the
researcher can use to directly tally data with the questionnaire. The
codes are written on a sheet of paper called a tally sheet and for each
response, a stroke is marked against the code in which it falls. Usually,
after every four strokes against a particular code, the fifth response is
indicated by drawing a diagonal or horizontal line through the strokes.
These groups are easy to count and the data is sorted against each code
conveniently.

• List and tally method: In this method, code responses may
be transcribed into a large worksheet, allowing a line for each
questionnaire. This facilitates listing of a large number of questionnaires
in one worksheet. Tallies are then made for each question.

• Card sort method: This is the most flexible hand tabulation method,
where the data is recorded on special cards that are of convenient sizes
and shapes and have a series of holes. Each hole in the card stands for a
code. When the cards are stacked, a needle passes through a particular
hole representing a particular code. These cards are then separated and
counted. In this way, frequencies of various codes can be found out by
the repetition of this technique.


Significance of tabulation 


Tabulation enables the researcher to arrange data in a concise and logical 
order. It summarizes the raw data and displays the same in a compact form 
for further analysis. It helps in the orderly arrangement of data in rows and 
columns. The various advantages of tabulation of data are as follows: 


• A table saves space and reduces descriptive and explanatory statements
to the minimum.

• It facilitates and eases the comparison process.

• Summation of elements and detection of omissions and errors become
easy in a tabular description.

• A table provides a basis for various statistical computations.


Checklist for tables


A table should communicate the required information to the reader in such a way 
that it becomes easy for him/her to read, comprehend and recall information when 
required. Certain conventions have to be followed during tabulation of data. 
These are as follows: 


• All tables should have a clear, precise and adequate title to make them
intelligible enough without any reference to the text.

• Tables should be featured with clarity and readability.

• Every table should be given a distinct number to facilitate an easy
reference.

• The table should be of an appropriate size and tally with the required
information.

• Headings for columns and rows should be in bold letters. It is a
general rule to include an independent variable in the left column or
the first row. The dependent variable is contained in the bottom row
or the right column.

• Numbers should be displayed such that they are neat and readable.

• Explanatory footnotes, if any, regarding the table should be placed
directly beneath the table, along with the reference symbols used in
the table.

• The source of the table should be indicated just below the table.

• The table should contain thick lines to separate data under one class
from the data under another class and thin lines to separate the different
subdivisions of the classes.

• All column figures should be properly aligned.

• Abbreviations should be avoided in a table to the best possible extent.

• If the data is large, it should not be crowded into a single
table. Crowding makes the table unwieldy and inconvenient.


Tabulation can also be classified as simple and complex. The former
type of tabulation gives information about one or more groups of independent
variables, whereas the latter shows the division of data into two or more
categories.


Diagrams 


The data we collect can often be more easily understood for interpretation 
if it is presented graphically or pictorially. Diagrams and graphs give visual 
indications of magnitudes, groupings, trends and patterns in the data. These 
important features are more simply presented in the form of graphs. Also, 
diagrams facilitate comparisons between two or more sets of data. 


The diagrams should be clear and easy to read and understand. Too 
much information should not be shown in the same diagram; otherwise, it 
may become cumbersome and confusing. Each diagram should include a 
brief and self-explanatory title dealing with the subject matter. The scale of 
the presentation should be chosen in such a way that the resulting diagram 
is of appropriate size. The intervals on the vertical as well as the horizontal 
axis should be of equal size; otherwise, distortions would occur. 


Diagrams are more suitable for illustrating discrete data,
while continuous data is better represented by graphs. Diagrammatic
presentation is studied in detail in the next unit.


11.3 TYPES OF ANALYSIS 


Analysis of data is the process of transforming data for the purpose of 
extracting useful information, which in turn facilitates the discovery of some 
useful conclusions. Finding conclusions from the analysed data is known 
as interpretation of data. In the case of survey or experimental data,
analysis involves estimating the values of the unknown parameters of
the population and testing hypotheses.


Analysis of data can be either descriptive or inferential. Inferential
analysis is also known as statistical analysis. Descriptive analysis is used
to describe the basic features of the data in a study, such as persons, work
groups and organizations. Inferential analysis is used to make inferences
from the data, which means that we are trying to understand some process
and make some possible predictions based on this understanding.


The three types of analyses are as follows: 


(i) Multiple regression analysis: This type of analysis is used to predict a
single dependent variable from a set of independent variables. In multiple
regression analysis, the independent variables are assumed not to be
highly correlated with each other.


(ii) Multiple discriminant analysis: In multiple discriminant analysis,
there is one single dependent variable, which is very difficult to
measure. One of the main objectives of this type of analysis is to
understand the group differences and predict the likelihood that an
entity, i.e., an individual or an object, belongs to a particular class or
group based on several metric independent variables.


(iii) Canonical correlation analysis: It is a method for assessing the
relationship between two sets of variables, each of which may contain
several variables.


Univariate, Bivariate and Multivariate Analysis 


Many types of analyses are performed according to the variance that exists
in the data. Such analyses are carried out to check if the differences between
three or more variables are significant enough to evaluate them statistically.
There are three types of such analyses, namely, univariate, bivariate and
multivariate analyses. These types are explained below:


(i) Univariate analysis: In this analysis, only a single variable is taken 
into consideration. It is usually the first activity pursued while analysing 
the data. It is performed with the purpose of describing each variable 
in terms of mean, median or mode, and variability. Examples of such 
analysis are averages or a set of cases that may come under a specific 
category amidst a whole sample. 


(ii) Bivariate analysis: This type of analysis examines the relationship between
two variables. It tries to find the extent of association that exists between
these variables. Thus, a bivariate analysis may help you, for example,
to find whether the variables of irregular meals and migraine headaches
are associated, and to what extent. Here, two variables are
statistically measured simultaneously.


(iii) Multivariate analysis: This type of analysis involves observation 
and analysis of three or more than three statistical variables at a 
time. Such an analysis is performed using statistical tests or even in a 
tabular format. Thus, for example, you can study the variables of age, 
educational qualification and annual income of a given set of population 
at the same time using the multivariate analysis method. 
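The difference between univariate and bivariate analysis can be illustrated with a short Python sketch. The data below are hypothetical (skipped meals per week and migraine days per month for six people), and `pearson_r` is an illustrative helper that measures association with the Pearson correlation coefficient, which is one of several possible measures.

```python
from math import sqrt
from statistics import mean, median, pstdev

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical observations for six respondents
skipped_meals = [0, 1, 2, 3, 4, 5]   # per week
migraine_days = [1, 1, 2, 4, 4, 6]   # per month

# Univariate analysis: describe one variable at a time
print("mean:", mean(migraine_days))
print("median:", median(migraine_days))
print("std. deviation:", round(pstdev(migraine_days), 2))

# Bivariate analysis: extent of association between the two variables
r = pearson_r(skipped_meals, migraine_days)
print("correlation:", round(r, 2))
```

A coefficient close to +1 or -1 suggests a strong linear association, while a value near 0 suggests little linear association; it does not by itself establish cause and effect.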
Usually, these types of analyses are more convenient when performed
in a tabular format. This involves using a cross-classification or contingency
table. In its simplest form, such a table is made of two columns and two rows,
showing the frequencies of two variables that are displayed in rows and columns. This is
more popularly known as constructing the bivariate table. Traditionally, the
independent variable is displayed in columns and the dependent ones in rows.
A multivariate table, if related to the same data, is the result of combining the
bivariate tables. In this case, each bivariate table is known as a partial table.
Usually, a multivariate table is created with the purpose of explaining or
replicating the primary relationship that is found in the bivariate table. Table
11.2(a) and (b) show an example of a bivariate table and a multivariate table.


Table 11.2 (a) Bivariate Table 


                              | 1991        | 1992        | 1993
Percentage of students failed | 33 per cent | 38 per cent | 42 per cent
Percentage of students passed | 67 per cent | 62 per cent | 58 per cent


Table 11.2 (b) Multivariate Table 


                                             | 1991          | 1992           | 1993
                                             | First Attempt | Second Attempt | Third Attempt
Percentage of students who passed in Maths   | 27 per cent   | 35 per cent    | [illegible]
Percentage of students who passed in English | 53 per cent   | 60 per cent    | 44 per cent


Although the data in both tables is related, except the variable of 
‘attempts’, the multivariate table has been displayed separately in this 
example. However, you should note that the tables have dealt simultaneously 
with two or more variables of the data. 
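A bivariate (cross-classification) table of the kind discussed above can be built from raw records with a short Python sketch. The records, the field names `year` and `result`, and the helper functions are hypothetical and serve only to illustrate the construction; the independent variable (year) is placed in the columns, following the convention stated earlier.

```python
from collections import Counter

def cross_tabulate(records, row_key, col_key):
    """Return row labels, column labels and cell frequencies."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    rows = sorted({rec[row_key] for rec in records})
    cols = sorted({rec[col_key] for rec in records})
    table = {(r, c): counts[(r, c)] for r in rows for c in cols}
    return rows, cols, table

def column_percentages(rows, cols, table):
    """Express each cell as a percentage of its column total."""
    result = {}
    for c in cols:
        total = sum(table[(r, c)] for r in rows)
        for r in rows:
            result[(r, c)] = 100 * table[(r, c)] / total if total else 0.0
    return result

# Hypothetical examination records
records = [
    {"year": 1991, "result": "passed"},
    {"year": 1991, "result": "passed"},
    {"year": 1991, "result": "failed"},
    {"year": 1992, "result": "passed"},
    {"year": 1992, "result": "passed"},
    {"year": 1992, "result": "passed"},
    {"year": 1992, "result": "failed"},
    {"year": 1992, "result": "failed"},
]

rows, cols, table = cross_tabulate(records, "result", "year")
pct = column_percentages(rows, cols, table)
for r in rows:
    print(r, [round(pct[(r, c)], 1) for c in cols])
```

Adding a third key (for example, the attempt number) and building one such table per value of that key would give the partial tables that make up a multivariate table.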


Data interpretation 


Data interpretation refers to the identification of trends in different variables.
The researcher uses statistics for this purpose. The researcher is required to
be familiar with the scales of measurement. This enables
him/her to choose the appropriate statistical method for his/her research
project. The scales of measurement facilitate the allotment of numerical
values to characteristics according to specific rules. This measurement is
also related to the levels of measurement of data, namely the nominal, ordinal,
interval and ratio levels. These levels can be explained as follows:


• Nominal measurement: The nominal measurement assigns a numeral
value to a specific characteristic. It is the fundamental form of
measurement. The nominal measurement represents the lowest level
of data available for measurement.

• Ordinal measurement: This type of measurement involves assigning
numeral values to a specific feature in terms of a specific order. The ordinal
scale displays the way in which the entity is measured. The ordinal
scale of measurement is used to calculate and derive data pertaining
to the median, percentage, rank order, correlations and percentile.

• Interval measurement: A researcher can depict the difference
between one aspect of the data and another aspect using this level
of measurement. The interval scale of measurement is useful for
the researcher in several ways. It can be applied in the calculation
of the arithmetic mean and standard deviation and in determining
the correlation between different variables.

• Ratio measurement: In this method, there are fixed proportions (ratios)
between the numerals and the amount of the characteristic that
they represent. A researcher should remember, while measuring at the ratio
level, that a fixed zero point exists. The ratio level of measurement
helps researchers determine whether the aspects possess a certain
characteristic. Almost any type of arithmetical calculation can be
executed using this scale of measurement.


The most important features of any measuring scale are its reliability and
validity, which are explained as follows:


• Reliability: It is the term used to deal with accuracy. A scale
of measurement can be said to be reliable when it measures exactly and only
what it is supposed to measure. In other words, when the same
researcher repeats a test with a different group that resembles the
original group, he/she should get the same results as before.


• Validity: According to Leedy, validity is the assessment of the
soundness and the effectiveness of the measuring instrument. There
are four types of validity, which can be stated as follows:

o Content validity: It deals with the accuracy with which an
instrument measures the factors or content of the course or situations
of the research study.

o Prognostic validity: It depends on the possibility to make
judgements from results obtained by the concerned measuring
instrument. The judgement is future oriented.

o Simultaneous validity: This involves comparing one measuring
instrument with another; one that measures the same characteristic
and is available immediately.

Multiple regression analysis is a statistical tool that helps researchers
to evaluate the effect of different factors on the consequences occurring at
the same time. It analyzes the relationship between several independent
or predictor variables and a dependent variable. In research,
regression analysis is used to investigate a particular set of predictors and
to show differences in the consequences that occur. Generally, regression is
used to determine the effect of specific factors along with the other factors
that influence these consequences. Researchers use algebraic methods to
analyze the result by holding a group of factors associated with a particular
phenomenon constant. In dictionary terms, multiple
regression is a statistical technique that predicts values of one variable on
the basis of two or more other variables.


Multiple regression and statistics: The term ‘multiple regression’ was first
used by Pearson. Regression is of two types, simple and multiple, and both
regression techniques are related to the analysis of variance (ANOVA).
Of these, multiple regression is the simplest method in comparison to other
multivariate statistical techniques.


Multiple regression and mathematics: The multiple regression technique
builds on the simple regression equation, which is used in mathematics to
find the best-fitting straight line through the points on an x-y plot
or a scattergram.
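The idea of predicting one dependent variable from two or more independent variables can be sketched as follows. This is a minimal illustration using ordinary least squares solved through the normal equations; the method, the function names and the data are assumptions made for the example and are not prescribed by the text. The hypothetical data were generated from y = 2 + 3x1 - x2, so the fitted coefficients should come out close to 2, 3 and -1.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_regression(xs, y):
    """Least-squares coefficients [b0, b1, ..., bk] of y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data: two predictors (x1, x2) and one dependent variable y
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [3, 7, 7, 11, 12]

coefficients = fit_multiple_regression(xs, y)
print([round(b, 3) for b in coefficients])
```

The normal equations become unstable when the predictors are strongly correlated with each other, which is one practical reason for wanting independent variables that are not highly correlated.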


Check Your Progress 


1. What do you mean by processing of data? 

2. List the functions that can be performed on data. 

3. What is ‘field editing’? 

4. Data can be classified into three categories. What are they? 


5. List three types of analyses. 


11.4 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. Research is incomplete without proper analysis of the collected data. 
Processing of data involves analysis and manipulation of the collected 
data by performing various functions. 


2. The functions that can be performed on data are:
• Editing
• Coding
• Tabulation
• Classification


3. The method of field editing involves the investigator reviewing reporting
forms that are written in an abbreviated form by the informant.
This kind of editing is usually done immediately after the interview.


4. Data can be classified into three categories: descriptive
classification, simple classification and manifold classification.


5. Three types of analysis are:
• Multiple regression analysis
• Multiple discriminant analysis
• Canonical correlation analysis


11.5 SUMMARY


• Research does not merely consist of data that is collected. Research is
incomplete without proper analysis of the collected data.

• Data processing involves analysis and manipulation of the collected
data by performing various functions. The data has to be processed in
accordance with the outline laid down when the research plan is being
developed.

• Editing of data involves the testing of data collection instruments in
order to ensure maximum accuracy.

• Collected data must have five features: accuracy, consistency,
unity, uniformity and effective arrangement.

• Representing the data symbolically by using some predefined rules is
termed as coding of data. Coding of data is essential for
performing efficient analysis.

• Data can be classified into three categories according to attributes and
into two as per class intervals.

• Tabulation means placing the results and data collected from research in
a tabular form. Tabulation can be done either manually or mechanically
using various electronic devices.

• The process of tabulation enables the researcher to arrange data in a
concise and logical order. It summarizes raw data and displays the
same in a compact form for further analysis.

• Analysis of data is the process of transforming data for the purpose of
extracting useful information, which in turn facilitates the discovery
of some useful conclusions.

• Analysis of data can be either descriptive or inferential. Inferential
analysis is also known as statistical analysis.

• Descriptive analysis is used to describe the basic features of the
data in a study, such as persons, work groups and organizations.

• Inferential analysis is used to make inferences from the data, which
means that we are trying to understand some process and make some
possible predictions based on this understanding.

• Many types of analyses are performed according to the variance that
exists in the data. Such analyses are carried out to check if the differences
between three or more variables are significant enough to evaluate them
statistically.

• Data interpretation refers to the identification of trends in different
variables. The researcher uses statistics for this purpose.

• Multiple regression analysis is a statistical tool that helps researchers
to evaluate the effect of different factors on the consequences occurring
at the same time.

• Multiple regression analyses the relationship between several
independent or predictor variables and a dependent variable.


11.6 KEY WORDS 


• Coding of Data: It refers to a symbolic representation of data using
some predefined rules.

• Analysis of Data: It refers to the process of transforming data for the
purpose of extracting useful information.

• Multiple Regression Analysis: It is a statistical tool that helps
researchers to evaluate the effect of different factors on the
consequences occurring at the same time.


11.7 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions 


1. What is processing and data analysis? 
2. What is central editing? 
3. Briefly discuss the significance of coding. 


4. Write a short note on the classification of data according to attributes. 


Long-Answer Questions 


1. Examine the classification of data. 
2. What are the problems related to classification of data? Discuss. 


3. Define tabulation and explain its methods. What is its significance? 
12.0 INTRODUCTION


In the previous unit, you were introduced to the processing and analysis of 
data. In this unit, the discussion on the analysis and processing of data will 
continue. It will discuss various parametric tests such as T test, F test and 
Z test. The unit will begin with a discussion on the fundamentals of testing 
procedure, as well as the various types of hypothesis testing. 


12.1 OBJECTIVES 


After going through this unit, you will be able to: 
• Discuss the fundamentals of hypothesis testing procedure
• Examine the various types of parametric tests
• Illustrate the chi-square test, t-test and z-test


Test of Hypothesis 


12.2 FUNDAMENTALS OF TEST PROCEDURE 


The following fundamental steps are followed in testing of a hypothesis: 


Setting up of a hypothesis: The first step is to establish the hypothesis to be
tested. These statistical hypotheses are generally assumptions
about the value of the population parameter; a hypothesis specifies a single
value or a range of values for the parameter. Two different hypotheses are
constructed rather than a single hypothesis. These two hypotheses are
generally referred to as (1) the null hypothesis, denoted by H0, and (2) the
alternative hypothesis, denoted by H1.


The null hypothesis is the hypothesis of the population parameter 
taking a specified value. In case of two populations, the null hypothesis is of 
no difference or the difference taking a specified value. The hypothesis that 
is different from the null hypothesis is the alternative hypothesis. If the null 
hypothesis H0 is rejected based upon the sample information, the alternative
hypothesis H1 is accepted. Therefore, the two hypotheses are constructed in
such a way that if one is true, the other one is false and vice versa. There 
can also be situations where the researcher is interested in establishing the 
relationship between any two variables. In such a case, a null hypothesis is set 
as the hypothesis of no relationship between those two variables; whereas the 
alternative hypothesis is the hypothesis of the relationship between variables. 
The rejection of the null hypothesis indicates that the differences/relationship 
have a statistical significance and the acceptance of the null hypothesis means 
that any difference/relationship is due to chance. 


Setting up of a suitable significance level: The next step in the testing of
hypothesis exercise is to choose a suitable level of significance. The level of
significance, denoted by α, is chosen before drawing any sample. The level
of significance denotes the probability of rejecting the null hypothesis when
it is true. The value of α varies from problem to problem, but usually it is
taken as either 5 per cent or 1 per cent. A 5 per cent level of significance
means that there are 5 chances out of a hundred that a null hypothesis will get
rejected when it should be accepted. This means that the researcher is 95 per
cent confident that a right decision has been taken. Therefore, it is seen that
the confidence with which a researcher rejects or accepts a null hypothesis
depends upon the level of significance. When the null hypothesis is rejected
at any level of significance, the test result is said to be significant. Further, if
a hypothesis is rejected at 1 per cent level, it must also be rejected at 5 per
cent significance level.


Determination of a test statistic: The next step is to determine a suitable
test statistic and its distribution. As would be seen later, the test statistic
could be t, Z, χ² or F, depending upon various assumptions to be discussed
later in the book.

Determination of critical region: Before a sample is drawn from the
population, it is very important to specify the values of the test statistic that will
lead to rejection or acceptance of the null hypothesis. The set of values that leads to
the rejection of the null hypothesis is called the critical region. Given a level of
significance, α, the optimal critical region for a two-tailed test consists of
an area of α/2 in the right hand tail of the distribution plus an area of α/2
in the left hand tail of the distribution, where the null hypothesis is
rejected. Therefore, establishing a critical region is similar to determining a
100(1 - α) per cent confidence interval.


Computing the value of test-statistic: The next step is to compute the value 
of the test statistic based upon a random sample of size n. Once the value of 
test statistic is computed, one needs to examine whether the sample results 
fall in the critical region or in the acceptance region. 


Making decision: The hypothesis may be rejected or accepted depending 
upon whether the value of the test statistic falls in the rejection or the 
acceptance region. Management decisions are based upon the statistical 
decision of either rejecting or accepting the null hypothesis. 


If the hypothesis is being tested at 5 per cent level of significance, 
it would be rejected if the observed results have a probability less than 5 
per cent. In such a case, the difference between the sample statistic and 
the hypothesized population parameter is considered to be significant. On 
the other hand, if the hypothesis is accepted, the difference between the 
sample statistic and the hypothesized population parameter is not regarded 
as significant and can be attributed to chance. 
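The steps above (choosing α, locating the critical region and making the decision) can be sketched in Python for a test statistic that follows the standard normal distribution. This is an illustrative sketch, not a general testing routine: the function names and the `tail` labels are assumptions made for the example, and `statistics.NormalDist` from the Python standard library (version 3.8 or later) supplies the normal quantiles.

```python
from statistics import NormalDist

def critical_region(alpha, tail="two-sided"):
    """Return (lower, upper) critical values; None means no bound on that side."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two-sided":
        z = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        return (-z, z)
    z = std_normal.inv_cdf(1 - alpha)          # whole area alpha in one tail
    return (-z, None) if tail == "left" else (None, z)

def decide(z_stat, alpha, tail="two-sided"):
    """Reject H0 when the statistic falls in the critical region."""
    lower, upper = critical_region(alpha, tail)
    in_rejection = (lower is not None and z_stat < lower) or (
        upper is not None and z_stat > upper
    )
    return "reject H0" if in_rejection else "accept H0"

lower, upper = critical_region(0.05)
print(round(lower, 2), round(upper, 2))
print(decide(2.5, 0.05))
print(decide(1.0, 0.05))
```

The wording "accept H0" follows the terminology used in this unit; many texts prefer "fail to reject H0" for the same outcome.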


12.2.1 Types of Hypothesis Testing 


A hypothesis is tested to identify the errors that have occurred in the 
statements and concepts used in that hypothesis. Hypothesis testing can be 
broadly divided into two types, which are as follows: 


e Parametric tests or standard tests of hypothesis 
e Non-parametric tests or distribution-free tests of hypothesis 


These are the two general classes of statistical tests. Parametric tests
are more powerful because they use interval- or ratio-level data and are
based on the following assumptions:


(a) The observations must be independent. 


(b) The observations need to be drawn from populations that are normally 
distributed. 


(c) The populations should have equal variances. 


It is the researcher’s responsibility to check the assumptions relevant
to the chosen test. Some of the popular parametric tests are the z-test, t-test and
F-test.


Non-parametric tests, on the other hand, have less stringent and fewer 
assumptions. They do not specify normally distributed populations or equality 
of variances. Some tests require independence of cases, while others are 
designed expressly for situations with related cases. Non-parametric tests are 
generally used for qualitative analysis (ordinal or nominal-level data). Both 
the categories of tests provide efficient results provided their selections are 
appropriate. Non-parametric tests include chi-square, run-test, Mann-Whitney 
test, Kruskal-Wallis test, etc. 


1. Parametric tests or standard tests of hypothesis 


Parametric tests assume certain properties of the population sample such 
as observations from a normal population, large sample size, population 
parameters like mean and variance. The various parametric tests of hypothesis 
are based on the assumption of normality. In other words, the source of data 
for them is normally distributed. They can be listed as follows: 


• Z-test: This kind of test is based on normal probability distribution. It
is mostly used to judge the significance of mean as a statistical measure.
This is the most frequently used test in research studies. It is generally
used to compare the mean of a sample with the hypothesized mean
of the population. It is also used in case the population variance is
known. It is helpful in judging the significance of difference between
the means of two independent large samples, to compare the sample
proportion to a theoretical value of population proportion and to judge
the significance of median, mode and coefficient of correlation.

• T-test: This test is based on t-distribution and is used to judge the
significance of a sample mean or the difference between the means of
two small samples when the population variance is not known.

• χ²-test: This test is based on a chi-square distribution and is used for
comparing a sample variance to a theoretical population variance.

• F-test: This test is based on F-distribution and is used to compare
the variances of two independent samples. It is also used to judge
the significance of multiple correlation coefficients.


2. Non-parametric tests or distribution-free tests of hypothesis 


There are situations in testing where assumptions cannot be made. In such 
situations, non-parametric methods are employed. There are various types 
of non-parametric tests. The important ones are as follows: 


• Sign test: This is one of the easiest tests in practice, based on the plus/
minus sign of an observation in a sample. The sign test may be one of the
following two types:

o One-sample sign test: This is a very simple distribution-free test
and is applied in case of a sample from a continuous symmetrical
population, wherein the probability of a sample value being either less
or more than the mean is half. Here, to test a null hypothesis, all those
items which are greater than the hypothesized value are replaced
by a plus sign and those which are less than the hypothesized value
are replaced by a minus sign.

o Two-sample sign test: In case of problems consisting of
paired data, the two-sample sign test is used. Here, each pair of values
is replaced with a plus sign if the value from the first sample is greater
than the corresponding value from the second sample. If the first value is less,
a minus sign is assigned.

• Fisher-Irwin test: This is used to test the hypothesis of no difference between
two sets of data. In other words, it is used to determine whether
two supposedly different treatments are in fact different in terms of the results
that they produce. It is applied in all those cases where the result for each
item in a sample can be placed in one of two mutually exclusive
categories.

• McNemar test: It is applied where the data is nominal in nature and
is related to two interrelated samples. By using this test, you can judge
the significance of any observed changes in the same subject.

• Wilcoxon matched-pairs test: This test is applied in the case of
matched pairs, such as the output of two similar machines. Here, you can
determine both the direction and the magnitude of the difference between the matched
values. This test is also called the Signed Rank Test.
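The one-sample sign test described above can be sketched in Python. Under the null hypothesis, plus and minus signs are equally likely, so the number of minus (or plus) signs follows a binomial distribution with probability one half. The sample values, the hypothesized value of 12.5 and the function name `sign_test` are hypothetical; dropping observations equal to the hypothesized value is a common convention that is assumed here, not something stated in the text.

```python
from math import comb

def sign_test(sample, hypothesized_value):
    """Two-sided one-sample sign test; returns (plus, minus, p_value)."""
    plus = sum(1 for v in sample if v > hypothesized_value)
    minus = sum(1 for v in sample if v < hypothesized_value)
    n = plus + minus            # ties with the hypothesized value are dropped
    k = min(plus, minus)
    # P(at most k of the rarer sign) when each sign has probability 1/2
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return plus, minus, p_value

# Hypothetical sample of ten observations
sample = [12, 15, 11, 18, 14, 16, 17, 13, 19, 20]
plus, minus, p_value = sign_test(sample, 12.5)
print(plus, minus, round(p_value, 4))
```

With 8 plus signs and 2 minus signs the two-sided p value is about 0.11, so at the 5 per cent level the null hypothesis would not be rejected for this sample.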


Check Your Progress 
1. What are the two types of hypothesis testing? 


2. When is the Fisher-Irwin test applied? 


12.3 PARAMETRIC TESTS 


A parametric statistical test is one that makes assumptions about the 
parameters (defining properties) of the population distribution(s) from which 
one’s data are drawn. Let us study these tests in detail. 


12.3.1 Tests Concerning Means in Case of Single and Two 
Population Means- Z-test 


If the sample size n is large, or if n is small but the value of the population
standard deviation is known, a Z-test is appropriate. There can be alternate
cases of two-tailed and one-tailed tests of hypotheses. Corresponding to the
null hypothesis $H_0: \mu = \mu_{H_0}$, the criteria shown in Table 12.1
could be used.


The test statistic is given by,

$$ Z = \frac{\bar{X} - \mu_{H_0}}{\sigma / \sqrt{n}} $$

where,

$\bar{X}$ = Sample mean

$\sigma$ = Population standard deviation

$\mu_{H_0}$ = The value of $\mu$ under the assumption that the null hypothesis is
true

$n$ = Size of sample


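Table 12.1 Criteria for accepting or rejecting null hypothesis under different cases of
alternative hypotheses


Alternative Hypothesis | Reject the Null Hypothesis if | Accept the Null Hypothesis if
$\mu < \mu_{H_0}$ | $Z < -Z_{\alpha}$ | $Z \geq -Z_{\alpha}$
$\mu > \mu_{H_0}$ | $Z > Z_{\alpha}$ | $Z \leq Z_{\alpha}$
$\mu \neq \mu_{H_0}$ | $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$ | $-Z_{\alpha/2} \leq Z \leq Z_{\alpha/2}$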


If the population standard deviation σ is unknown, the sample standard
deviation

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} $$

is used as an estimate of σ. It may be noted that $Z_{\alpha}$ and $Z_{\alpha/2}$ are Z values such
that the area to the right under the standard normal distribution is α and α/2
respectively. Below are solved examples using the above concepts.


Example 12.1 


A sample of 200 bulbs made by a company gives a mean lifetime of 1540
hours with a standard deviation of 42 hours. Is it likely that the sample has
been drawn from a population with a mean lifetime of 1500 hours? You may
use 5 per cent level of significance.


Solution: 


In the above example, the sample size is large (n = 200), sample mean ($\bar{X}$)
equals 1540 hours and the sample standard deviation (s) is equal to 42 hours.
The null and alternative hypotheses can be written as:

$H_0 : \mu = 1500$ hrs

$H_1 : \mu \neq 1500$ hrs

It is a two-tailed test with the level of significance ($\alpha$) equal to 0.05.
Since n is large (n > 30), a Z-test can be used even though the population
standard deviation $\sigma$ is unknown. The test statistic is given by:

$$Z = \frac{\bar{X} - \mu_{H_0}}{\hat{\sigma}_{\bar{X}}}$$

where, $\mu_{H_0}$ = Value of $\mu$ under the assumption that the null hypothesis is true

$\hat{\sigma}_{\bar{X}}$ = Estimated standard error of mean

Here, $\mu_{H_0} = 1500$, $\hat{\sigma}_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{42}{\sqrt{200}} = 2.97$

(Note that $\hat{\sigma}$ is the estimated value of $\sigma$.)

$$Z = \frac{\bar{X} - \mu_{H_0}}{s/\sqrt{n}} = \frac{1540 - 1500}{2.97} = \frac{40}{2.97} = 13.47$$


The value of $\alpha$ = 0.05 and since it is a two-tailed test, the critical values
of Z are given by $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ (that is, -1.96 and 1.96), which can be
obtained from the standard normal table.


Fig 12.1 Rejection regions for Example 12.1 (two-tailed test: the rejection regions lie in both tails of the standard normal curve)


Since the computed value of Z = 13.47 lies in the rejection region, the 
null hypothesis is rejected. Therefore, it can be concluded that the average 
life of the bulb is significantly different from 1500 hours. 


Alternative Approach to the Test of Hypothesis 


There is an alternative approach called probability approach or simply p 
value approach to test the hypothesis. Under this approach, the researcher 
does not have to refer to Z table to determine the critical value. Referring to 
Example 12.1, the p value can be calculated as follows: 


p = P(Z > 13.47) + P(Z < -13.47)

We know that the problem is that of a two-sided test and Z has a
symmetric distribution, therefore,

p = 2P(Z > 13.47) ≈ 2 × 0 = 0

Now, the decision rule is:

Reject $H_0$ if p < $\alpha$

Accept $H_0$ if p ≥ $\alpha$

In this example, $\alpha$ = 0.05 and the p value is less than $\alpha$, so the null
hypothesis is rejected. Therefore, it may be noted that the same conclusion
is arrived at and there is no need to look at the critical value of Z as given in
the statistical table. These days, most computer software like SPSS, EXCEL,
SAS and MINITAB provide both the computed value of the test statistic and the
corresponding p value. Please note that the p value provided there is for the
two-sided test. In case the problem is of a one-sided test, the reported p value is
divided by 2 to obtain the desired p value for the problem and then compared
with alpha ($\alpha$), the level of significance, so as to either accept or reject the null
hypothesis. This is possible since the Z-distribution is a symmetrical distribution.
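
The calculation in Example 12.1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `z_test_one_sample` is an assumption made here, not something from the text, and the two-sided p value is obtained from the standard library's complementary error function instead of a printed Z table.

```python
import math


def z_test_one_sample(sample_mean, mu_0, sd, n):
    """Return (z, two-sided p value) for a one-sample Z-test.

    sd is the population standard deviation, or the sample standard
    deviation used as its estimate when n is large.
    """
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - mu_0) / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Figures from Example 12.1: n = 200, mean = 1540, s = 42, mu_0 = 1500.
z, p = z_test_one_sample(1540, 1500, 42, 200)
alpha = 0.05
print("Z =", round(z, 2))
print("Reject H0" if p < alpha else "Accept H0")
```

For a one-sided test, the returned two-sided p value would be halved before comparing it with the level of significance, as described above.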


Tests for Difference between Two Population Means 


So far we have been concerned with the testing of means of a single 
population. We took up the cases of both large and small samples. It would 
be interesting to examine the difference between the two population means. 
Again, various cases would be examined as discussed below: 


Case of Large Sample 


In case both the sample sizes are greater than 30, a Z-test is used. The 
hypothesis to be tested may be written as: 


$H_0 : \mu_1 = \mu_2$
$H_1 : \mu_1 \neq \mu_2$
where,
$\mu_1$ = Mean of population 1
$\mu_2$ = Mean of population 2


The above is a case of two-tailed test. The test statistic used is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{H_0}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

$\bar{X}_1$ = Mean of sample drawn from population 1

$\bar{X}_2$ = Mean of sample drawn from population 2

$n_1$ = Size of sample drawn from population 1

$n_2$ = Size of sample drawn from population 2


If $\sigma_1$ and $\sigma_2$ are unknown, their estimates given by $\hat{\sigma}_1$ and $\hat{\sigma}_2$ are used.

$$\hat{\sigma}_1 = s_1 = \sqrt{\frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(X_{1i} - \bar{X}_1)^2}$$

$$\hat{\sigma}_2 = s_2 = \sqrt{\frac{1}{n_2 - 1}\sum_{i=1}^{n_2}(X_{2i} - \bar{X}_2)^2}$$


The Z value for the problem can be computed using the above formula 
and compared with the table value to either accept or reject the hypothesis. 
Let us consider the following problem. 


Example 12.2 


A study is carried out to examine whether the mean hourly wages of the 
unskilled workers in the two cities—Ambala Cantt and Lucknow are the 
same. The random sample of hourly earnings in both the cities is taken and 
the results are presented in the Table 12.2. 


Table 12.2 Survey Data on Hourly Earnings in Two Cities

City            Sample Mean               Standard Deviation       Sample Size
                Hourly Earnings           of Sample

Ambala Cantt    8.95 ($\bar{X}_1$)        0.40 ($s_1$)             200 ($n_1$)
Lucknow         9.10 ($\bar{X}_2$)        0.60 ($s_2$)             175 ($n_2$)


Using a 5 per cent level of significance, test the hypothesis of no difference 
in the average wages of unskilled workers in the two cities. 


Solution: 
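We use subscripts 1 and 2 for Ambala Cantt and Lucknow respectively.
$H_0 : \mu_1 = \mu_2 \Rightarrow \mu_1 - \mu_2 = 0$
$H_1 : \mu_1 \neq \mu_2 \Rightarrow \mu_1 - \mu_2 \neq 0$
The following survey data is given:
$\bar{X}_1 = 8.95$, $\bar{X}_2 = 9.10$, $s_1 = 0.40$, $s_2 = 0.60$, $n_1 = 200$, $n_2 = 175$, $\alpha = 0.05$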


Since both n,, n, are greater than 30 and the sample standard deviations 
are given, a Z-test would be appropriate. 


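The test statistic is given by

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{H_0}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

As $\sigma_1$, $\sigma_2$ are unknown, their estimates would be used.

$s_1 = \hat{\sigma}_1$, $s_2 = \hat{\sigma}_2$

$$\sqrt{\frac{\hat{\sigma}_1^2}{n_1} + \frac{\hat{\sigma}_2^2}{n_2}} = \sqrt{\frac{(0.4)^2}{200} + \frac{(0.6)^2}{175}} = \sqrt{0.0028} = 0.053$$

$$Z = \frac{(8.95 - 9.10) - 0}{0.053} = -2.83$$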


As the problem is of a two-tailed test, the critical values of Z at 5 per cent
level of significance are given by $-Z_{\alpha/2}$ = -1.96 and $Z_{\alpha/2}$ = 1.96. The sample
value of Z = -2.83 lies in the rejection region as shown in the figure below:


Fig 12.2 Rejection regions for Example 12.2 (rejection regions lie below -1.96 and above 1.96; the sample value -2.83 falls in the lower rejection region)


Therefore, the null hypothesis is rejected and it may be concluded that 
there is a difference in the average wages of unskilled workers in the two 
cities. Let us rework the same problem using the p value approach. As it is 
known that the problem is of a two-tailed test, the p value is given by: 


p = P(Z < -2.83) + P(Z > 2.83)
  = 2P(Z > 2.83)
  = 2 × (0.5 - 0.4977)
  = 2 × 0.0023
  = 0.0046


As the value of p is less than $\alpha$ (0.05), the null hypothesis is rejected.
Similarly, the problems on one-tailed tests can be solved. 
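
A minimal Python sketch of the large-sample test for the difference between two means, using the figures of Example 12.2, is given below. The function name `z_test_two_sample` is hypothetical. Because the code does not round the standard error to 0.053, it gives Z of about -2.81 instead of the hand-computed -2.83; the conclusion is unchanged.

```python
import math


def z_test_two_sample(mean1, mean2, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """Return (z, two-sided p value) for a large-sample two-mean Z-test."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = ((mean1 - mean2) - hypothesized_diff) / standard_error
    # Two-sided p value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Ambala Cantt (sample 1) versus Lucknow (sample 2), as in the solution above.
z, p = z_test_two_sample(8.95, 9.10, 0.40, 0.60, 200, 175)
print("Z =", round(z, 2), "p =", round(p, 4))
print("Reject H0" if p < 0.05 else "Accept H0")
```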




12.3.2 Hypothesis Testing for Comparing Two Related Terms: T-test 


The method for comparing two related samples in hypothesis testing is the
paired t-test. For this test, it is necessary that the observations in the two
samples be collected in the form of matched pairs. It means that each
observation in one sample must be paired with an observation in the other
sample in such a way that the two are matched on all factors other than the
one under study.

An important formula for calculating this is the mean of the differences:

$$\bar{D} = \frac{\sum D_i}{n}$$

And the formula for calculating the variance of the differences is:

$$\sigma_D^2 = \frac{\sum D_i^2 - (\bar{D})^2 \times n}{n - 1}$$

Where, $D_i$ = Difference between the two values of the i-th matched pair,
$\bar{D}$ = Mean of differences and n = Number of matched pairs.


The t-test is based on t-distribution, which is a probability distribution 
that arises in the problem of estimating the mean of a normally distributed 
population when the sample size is small. t-distribution arises when the 
population standard deviation is unknown and has to be estimated from the 
data. 


In Student's t-distribution, let X be a random variable with mean $\mu$ and
variance $\sigma^2$, let Z be the standard normal statistic of $\bar{X}$, and let
$\chi^2$ be a random variable which follows the chi-square distribution with $\nu$
degrees of freedom.


If the variables are independent of each other, then the t statistic is:

$$t = \frac{Z}{\sqrt{\chi^2 / \nu}}$$

The standard normal statistic and the chi-square statistic are:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

and

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$ with (n - 1) degrees of freedom.


After substituting these two statistics in t, the t statistic is represented as:

$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$ with (n - 1) degrees of freedom.


The t-distribution is typically used when the sample size is small (not more
than 30); when the sample size is more than 30, it can be approximated
by a normal distribution.
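
The paired-sample formulas above can be illustrated with a short Python sketch. The data are made up for illustration, the function name `paired_t_statistic` is hypothetical, and the statistic used, t = mean difference divided by (standard deviation of differences / √n) with (n - 1) degrees of freedom, is the usual paired t statistic; the text above states the mean and variance of the differences but not this last step, so treat it as an assumption of the sketch.

```python
import math


def paired_t_statistic(first, second):
    """Return (mean difference, variance of differences, t, degrees of freedom)."""
    if len(first) != len(second):
        raise ValueError("matched pairs require samples of equal size")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Variance of differences: (sum of D^2 - n * mean^2) / (n - 1).
    var_diff = (sum(d * d for d in diffs) - n * mean_diff ** 2) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return mean_diff, var_diff, t, n - 1


# Hypothetical matched pairs, e.g. output of the same workers before and after training.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]
mean_diff, var_diff, t, dof = paired_t_statistic(before, after)
print(mean_diff, var_diff, round(t, 4), dof)
```

The computed t would then be compared with the tabulated t value for (n - 1) degrees of freedom at the chosen level of significance.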
12.3.3 Hypothesis Testing of Proportions, Difference between
Proportions and Comparing Variance


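You can test proportions by using hypothesis testing. The standard deviation
of the proportion of successes is given by:

$$\text{Standard deviation of the proportion of successes} = \sqrt{\frac{p \times q}{n}}$$

If n is large, the binomial distribution tends to become a normal distribution.
For proportion testing, you use the statistic z as under:

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p \times q}{n}}}$$

where $\hat{p}$ = proportion of successes in the sample, p = proportion of
successes under the null hypothesis, q = 1 - p and n = size of sample.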


(b) Hypothesis testing for differences between proportions 


If two samples are drawn from different populations, one may be interested in
knowing whether the difference between the proportions is significant or not.


The formula for testing the significance of difference is as under:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}}$$

Where, $\hat{p}_1$ = Proportion of success in sample one
$\hat{p}_2$ = Proportion of success in sample two

$\hat{q}_1 = 1 - \hat{p}_1$

$\hat{q}_2 = 1 - \hat{p}_2$

$n_1$ = Size of sample one

$n_2$ = Size of sample two
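
Both proportion tests can be sketched in Python as follows. The function names and the counts used are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def z_test_proportion(successes, n, p_null):
    """z statistic for a single proportion against the hypothesized value p_null."""
    p_hat = successes / n
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    return (p_hat - p_null) / standard_error


def z_test_two_proportions(successes1, n1, successes2, n2):
    """z statistic for the difference between two sample proportions."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / standard_error


# Hypothetical data: 60 successes in 100 trials, tested against p = 0.5.
print(round(z_test_proportion(60, 100, 0.5), 4))
# Hypothetical data: 60 of 100 successes in sample one, 50 of 100 in sample two.
print(round(z_test_two_proportions(60, 100, 50, 100), 4))
```

Each z value is compared with the critical value from the standard normal table, exactly as in the Z-tests for means.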
Hypothesis testing for comparing a variance: Chi-square test 


This test is used to compare a sample variance to some theoretical or 
hypothesized variance of population. It is different from z-test and t-test. 
The test used for this purpose is known as chi-square test. It is used to test 
null hypothesis. 


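The formula for this is as follows:

$$\chi^2 = \frac{\sigma_s^2}{\sigma_p^2}(n - 1)$$

or

$$\chi^2 = \frac{\text{Variance of the sample}}{\text{Variance of population}} \times \text{Degrees of freedom}$$

Where $\sigma_s^2$ = variance of the sample, $\sigma_p^2$ = variance of the population
and n = number of items in the sample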


By comparing the calculated value of chi-square with the table value for n - 1
degrees of freedom at a given level of significance, you can determine whether $H_0$
is accepted or rejected.
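
As a rough illustration, the chi-square statistic for comparing a sample variance with a hypothesized population variance can be computed as below. The sample values and the hypothesized variance are invented for the sketch, and the function name is hypothetical; the critical value 9.488 is the usual upper 5 per cent table value of chi-square for 4 degrees of freedom.

```python
def chi_square_variance_statistic(sample, hypothesized_variance):
    """Return (chi-square statistic, degrees of freedom) for a variance test."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with (n - 1) in the denominator.
    sample_variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    chi_square = sample_variance / hypothesized_variance * (n - 1)
    return chi_square, n - 1


# Hypothetical sample tested against a population variance of 5.
chi_square, dof = chi_square_variance_statistic([4, 6, 8, 10, 12], 5)
table_value = 9.488  # chi-square table, 4 degrees of freedom, 5 per cent level
print(chi_square, dof)
print("Reject H0" if chi_square > table_value else "Accept H0")
```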


The chi-square test is based on the concept of chi-square distribution.
This type of distribution is used when you are dealing with collections
of values that involve adding up squares. The chi-square distribution is not
symmetrical and none of its values is negative. You need to know the degrees of
freedom for using the chi-square distribution. The chi-square test is used for
judging the significance of difference between observed (O) and expected
(E) frequencies. The generalized shape of the $\chi^2$ distribution depends on the
degrees of freedom and $\chi^2$ (chi-square) is written as,

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

The chi-square test thus assesses whether the difference between the expected
frequency of an occurrence and the observed frequency of the same occurrence
is significant or could have arisen by chance. The chi-square testing
can be classified further as follows:


• Chi-square goodness of fit test: This test is used for performing a
comparison between a theoretical distribution and the observed data
from a sample. As the name implies, it tests the fit between a theoretical
frequency distribution and a frequency distribution of observed data.


• Chi-square test of association: The chi-square test of association
facilitates comparing two attributes in a sample data. The comparison
enables the researcher to determine whether there exists any relation
between the two attributes.


• Chi-square test of homogeneity: Here, the test is concerned with
determining whether two populations have the same proportion of
observations with a common characteristic.


Computing the variance of a sample requires adding a collection of squared
quantities, and so its distribution is related to the chi-square distribution. The
chi-square distribution is a mathematical distribution that is used directly
or indirectly in many tests of significance. The most common use of the
chi-square distribution is to test differences among proportions. Figure 12.3
shows the chi-square distribution.


Fig. 12.3 Chi-square Distribution (horizontal axis: chi-square variable; vertical axis: frequency/probability; one curve for each number of degrees of freedom)


The variance of the sample is represented by:

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$$

where

$x_i$ = observation of the sample

$\bar{x}$ = mean of the sample

n = size of the sample

The chi-square statistic is represented as:

$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)s^2}{\sigma^2}$$ with (n - 1) degrees of freedom.


In this formula, the variance of the population is represented as $\sigma^2$. When a
random sample of size n is taken from a normal population with variance $\sigma^2$,
the statistic follows what is known as the chi-square ($\chi^2$) distribution
with (n - 1) degrees of freedom.


12.3.4 Testing the Equality of Variances of Two Normal Populations: 
F-test 


To test the equality of variances of two normal populations, the F-test is used, 
which is based on F-distribution. 


The formula for this hypothesis testing is as follows:

$$F = \frac{\sigma_{s_1}^2}{\sigma_{s_2}^2}$$

Where, $\sigma_{s_1}^2 = \dfrac{\sum (X_{1i} - \bar{X}_1)^2}{(n_1 - 1)}$ and $\sigma_{s_2}^2 = \dfrac{\sum (X_{2i} - \bar{X}_2)^2}{(n_2 - 1)}$

The following presumptions are made while using F-test:

• The populations from which the samples are drawn are normal.

• Samples have been drawn randomly.

• Observations are independent.

• There is no measurement error.


The object of the F-test is to test the hypothesis whether the two samples
are from the same normal population with equal variance or from two normal
populations with equal variances. It is also used to verify the
hypothesis of equality between two variances. But, this test is now mostly
used in the analysis of variance.


The F-test depends on F distribution, which is an asymmetric 
distribution that has a minimum value of 0, but no maximum value. The curve 
reaches a peak not far to the right of 0, and then approaches the horizontal 
axis. Figure 12.4 shows the F distribution. 


Fig. 12.4 F Distribution (horizontal axis: F; vertical axis: frequency/probability)

In F distribution, $(n_1 - 1)$ and $(n_2 - 1)$ are the degrees of freedom, and
the F statistic is represented as:

$$F = \frac{\dfrac{(n_1 - 1)S_1^2 / \sigma_1^2}{n_1 - 1}}{\dfrac{(n_2 - 1)S_2^2 / \sigma_2^2}{n_2 - 1}}$$ with $(n_1 - 1)$ and $(n_2 - 1)$ degrees of freedom.


If $\sigma_1^2 = \sigma_2^2$, then the formula of F distribution will be represented as:

$$F = \frac{S_1^2}{S_2^2}$$ with $(n_1 - 1)$ and $(n_2 - 1)$ degrees of freedom.
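
A small Python sketch of the variance ratio is shown below. The two samples are invented for illustration and the function name `f_ratio` is hypothetical; the resulting F would be compared with the F table value for the stated degrees of freedom.

```python
def sample_variance(values):
    """Sample variance with (n - 1) in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def f_ratio(sample1, sample2):
    """Return (F, numerator degrees of freedom, denominator degrees of freedom)."""
    f_value = sample_variance(sample1) / sample_variance(sample2)
    return f_value, len(sample1) - 1, len(sample2) - 1


# Hypothetical samples from two normal populations.
f_value, dof1, dof2 = f_ratio([4, 6, 8, 10, 12], [5, 6, 7, 8, 9])
print(f_value, dof1, dof2)
```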


Hypothesis testing of correlation coefficients


To know the significance of correlation coefficient based on sample data, the
following formulae are applied:


• For simple correlation coefficient: $t = r_{yx}\sqrt{\dfrac{n - 2}{1 - r_{yx}^2}}$; the null hypothesis is
either accepted or rejected based on the value of t.


• For partial correlation coefficient: $t = r_p\sqrt{\dfrac{n - k}{1 - r_p^2}}$, where k is the number of
variables involved; the null hypothesis is
either accepted or rejected based on the value of t.
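
The t statistic for a correlation coefficient can be sketched as follows. The function name and the values r = 0.6, n = 27 are hypothetical; the parameter `k` is the number of variables involved, so the default k = 2 corresponds to the simple correlation formula with (n - 2) in the numerator.

```python
import math


def correlation_t(r, n, k=2):
    """t statistic for a correlation coefficient r from n observations.

    k is the number of variables involved: k = 2 for a simple correlation,
    larger for a partial correlation. Degrees of freedom are (n - k).
    """
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - k) / (1 - r ** 2))


# Hypothetical simple correlation of 0.6 based on 27 paired observations.
print(correlation_t(0.6, 27))
```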


12.4 STATISTICAL TECHNIQUES OF HYPOTHESIS 
TESTING 


The statistical techniques of hypothesis testing involve proving a statement
that acts as an alternative for the hypothesis. There are several types of
statistical techniques employed in hypothesis testing which can be explained
with suitable examples.


(a) Hypothesis testing of means 


There are different situations under hypothesis testing of means. The testing 
technique is different in different situations, which are as follows: 


• If population is normal, population is infinite, sample size may
be large or small and the variance of population is known, then $H_a$
may be one-sided or two-sided. In such a situation, z-test is used and
formula for this is as follows:

$$z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}}$$

• If population is normal, population is finite, sample size may be
large or small and the variance of population is known, then $H_a$ may
be one-sided or two-sided.

The formula for this situation is as follows:

$$z = \frac{\bar{X} - \mu_{H_0}}{(\sigma_p / \sqrt{n}) \times \sqrt{(N - n)/(N - 1)}}$$

• If population is normal, population is infinite, sample size is small and
variance of the population is unknown, then $H_a$ may be one-sided or
two-sided.

The formula for this situation is as follows:

$$t = \frac{\bar{X} - \mu_{H_0}}{\sigma_s / \sqrt{n}}$$ with (n - 1) degrees of freedom

and

$$\sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

• If population is normal, population is finite, sample size is small and
variance of the population is unknown, then $H_a$ may be one-sided or
two-sided.

The formula for this situation is as follows:

$$t = \frac{\bar{X} - \mu_{H_0}}{(\sigma_s / \sqrt{n}) \times \sqrt{(N - n)/(N - 1)}}$$ with (n - 1) degrees of freedom

• If population is not normal but sample size is large, variance of the
population is known or unknown, then $H_a$ may be one-sided or two-sided.

The formula for this situation is as follows:

$$z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}}$$

OR

$$z = \frac{\bar{X} - \mu_{H_0}}{(\sigma_p / \sqrt{n}) \times \sqrt{(N - n)/(N - 1)}}$$

In these formulas, N denotes the size of the finite population, n the size of
the sample, $\sigma_p$ the standard deviation of the population and $\sigma_s$ the
standard deviation of the sample.
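
The effect of the finite population multiplier can be seen in a short Python sketch. The function name and the numbers (sample mean 52, hypothesized mean 50, standard deviation 10, n = 100, N = 1000) are hypothetical and serve only as an illustration of the formulas listed above.

```python
import math


def mean_test_statistic(sample_mean, mu_null, sd, n, population_size=None):
    """Test statistic for a mean; applies the finite population multiplier
    sqrt((N - n) / (N - 1)) when population_size (N) is given."""
    standard_error = sd / math.sqrt(n)
    if population_size is not None:
        standard_error *= math.sqrt((population_size - n) / (population_size - 1))
    return (sample_mean - mu_null) / standard_error


# Infinite population: no correction.
print(round(mean_test_statistic(52, 50, 10, 100), 4))
# Finite population of N = 1000: the corrected standard error is smaller,
# so the statistic is larger in magnitude.
print(round(mean_test_statistic(52, 50, 10, 100, population_size=1000), 4))
```

Whether the result is read against the Z table or the t table depends on the case: known variance or a large sample points to z, while a small sample with unknown variance points to t with (n - 1) degrees of freedom.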


(b) Hypothesis testing for difference between means 


There are situations where the significance of difference between the two 
means is examined. Such situations are stated along with their respective 
formulas: 


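• Population variances are known or the samples happen to be large
samples. The formula used under this situation is as follows:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{p_1}^2}{n_1} + \dfrac{\sigma_{p_2}^2}{n_2}}}$$

In case $\sigma_{p_1}$ and $\sigma_{p_2}$ are not known, $\sigma_{s_1}$ and $\sigma_{s_2}$ are used in the same
formula, which can be rewritten as:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{s_1}^2}{n_1} + \dfrac{\sigma_{s_2}^2}{n_2}}}$$

• Samples happen to be large but presumed to have been drawn from
the same population whose variance is known.

The formula used under this situation is as follows:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

In case $\sigma_p$ is not known, $\sigma_{1.2}$ (combined standard deviation of both the
samples) is used. The formula thus obtained after replacing it is as follows:

$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_{1.2}^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where

$$\sigma_{1.2} = \sqrt{\frac{n_1(\sigma_{s_1}^2 + D_1^2) + n_2(\sigma_{s_2}^2 + D_2^2)}{n_1 + n_2}}$$

Where $D_1 = \bar{X}_1 - \bar{X}_{1.2}$ and $D_2 = \bar{X}_2 - \bar{X}_{1.2}$

and $\bar{X}_{1.2} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$

• Samples happen to be small and population variances are not known
but assumed to be equal.

The formula for this situation is as under:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sum (X_{1i} - \bar{X}_1)^2 + \sum (X_{2i} - \bar{X}_2)^2}{n_1 + n_2 - 2}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

with $(n_1 + n_2 - 2)$ degrees of freedom, or equivalently

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)\sigma_{s_1}^2 + (n_2 - 1)\sigma_{s_2}^2}{n_1 + n_2 - 2}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$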
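
The small-sample case with equal but unknown variances can be sketched in Python as below. The two samples are hypothetical and the function name `pooled_t_statistic` is an assumption of this sketch.

```python
import math


def pooled_t_statistic(sample1, sample2):
    """Return (t, degrees of freedom) for two small independent samples,
    assuming the two population variances are equal."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Sums of squared deviations about each sample mean.
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    dof = n1 + n2 - 2
    pooled_sd = math.sqrt((ss1 + ss2) / dof)
    t = (mean1 - mean2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, dof


# Hypothetical small samples.
t, dof = pooled_t_statistic([4, 6, 8, 10, 12], [5, 6, 7, 8, 9])
print(round(t, 4), dof)
```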


12.5 CHI-SQUARE TEST AND CONTINGENCY 
TABLE 


From the observation (of data), different statistics are constructed to
estimate the population parameters. In general (but not always) the sampling
distribution of these statistics depends on the parameters and form of the
parent population. The difference between distributions has been previously
studied through constants like mean, standard deviation, etc., which are the
estimates of the parameters, but generally these do not give all the features
of these distributions. This caused the necessity to have some index which
can measure the degree of difference between the actual frequencies of the
various groups and can thus compare all necessary features of them. An index
of this type is "Karl Pearson's $\chi^2$ (chi-square)", which is used to measure
the deviations of observed frequencies in an experiment from the expected
frequencies obtained from some hypothetical universe. Here, we are going
to study a distribution called the $\chi^2$-distribution which enables us to compare a
whole set of sample values with a corresponding set of hypothetical values.


The $\chi^2$ distribution was discovered in 1875 by Helmert and was again
discovered independently in 1900 by Karl Pearson who applied it as a test
of goodness of fit.


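Definition of $\chi^2$ (or Chi-square)


If $f_o$ and $f_e$ denote the observed and corresponding expected frequencies of a
class interval (or cell), then chi-square is defined by the relation:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where the summation extends to the whole set of class-intervals.


Another form of $\chi^2$ is obtained as follows:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \sum \frac{f_o^2 + f_e^2 - 2 f_o f_e}{f_e}$$

$$= \sum \left( \frac{f_o^2}{f_e} + f_e - 2 f_o \right) = \sum \frac{f_o^2}{f_e} + \sum f_e - 2 \sum f_o$$

$$= \sum \frac{f_o^2}{f_e} + N - 2N$$

($\sum f_o = \sum f_e = N$, the total frequency)

$$\therefore \chi^2 = \sum \frac{f_o^2}{f_e} - N$$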
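
A quick numerical check that the two forms of chi-square agree is sketched below in Python. The observed frequencies are hypothetical (a die thrown 60 times, with 10 expected in each class) and the function names are illustrative only.

```python
def chi_square_direct(observed, expected):
    """Chi-square as the sum of (fo - fe)^2 / fe over all classes."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def chi_square_shortcut(observed, expected):
    """Chi-square as the sum of fo^2 / fe minus N, valid when both
    sets of frequencies add up to the same total N."""
    total = sum(observed)
    return sum(fo ** 2 / fe for fo, fe in zip(observed, expected)) - total


# Hypothetical frequencies for the six faces of a die thrown 60 times.
observed = [8, 12, 9, 11, 10, 10]
expected = [10, 10, 10, 10, 10, 10]
print(chi_square_direct(observed, expected))
print(chi_square_shortcut(observed, expected))
```

The shortcut form is convenient for hand calculation because it avoids computing each difference, but it relies on the observed and expected totals being equal.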


Note: It can be proved that if $x_1, x_2, \ldots, x_n$ be n independent normal variates each
having zero mean and unit variance, then the sum of the squares of these n variates,
i.e., $(x_1^2 + x_2^2 + \ldots + x_n^2)$, is a statistic called $\chi^2$ with 'n' degrees of freedom or a stochastic
variate on the $\chi^2$ distribution with 'n' degrees of freedom (the no. of independent
variates is called the no. of degrees of freedom).


Degrees of Freedom and Constraints 


Let the individuals of a sample be grouped into 'n' classes or cells but instead
of these being independent, let those be subject to 'c' independent linear
constraints, then the no. of degrees of freedom $\nu$ is defined by the relation
$\nu = (n - c)$,


i.e., (degrees of freedom) = (no. of groups) - (no. of linear constraints)


Note: Each independent linear constraint reduces the no. of degrees of freedom by
one.


The Chi-square Distribution 


For large sample size, the sampling (probability) distribution of $\chi^2$ can
be closely approximated by a continuous curve known as the chi-square
distribution.


The probability function of the $\chi^2$ distribution is given by

$$f(\chi^2) = c\,(\chi^2)^{(\nu/2 - 1)}\, e^{-\chi^2/2}$$

where $e = 1 + \dfrac{1}{1!} + \dfrac{1}{2!} + \dfrac{1}{3!} + \ldots$ to infinity = 2.71828
$\nu$ = no. of degrees of freedom
c = a constant depending only on $\nu$.


The chi-square distribution has only one parameter $\nu$, the no. of degrees
of freedom. This is similar to the case of the t-distribution. Hence $f(\chi^2)$ is a
family of distributions, one for each value of $\nu$.


Important Properties of the Distribution 


(i) The $\chi^2$ distribution is a continuous probability distribution which has the
value zero at its lower limit and extends to infinity in the positive
direction. A negative value of $\chi^2$ is not possible (since the differences
between the observed and expected frequencies are always squared).


(ii) The exact shape of the distribution depends upon the no. of degrees of
freedom $\nu$. For different values of $\nu$, we shall have different shapes
of distribution. In general, when $\nu$ is small, the shape of the curve
is skewed to the right and as $\nu$ gets larger, the distribution becomes
more and more symmetrical and can be approximated by the normal
distribution.


(iii) The mean of the $\chi^2$ distribution is given by the degrees of freedom, i.e.,
$E(\chi^2) = \nu$, and the variance is twice the degrees of freedom, i.e., $V(\chi^2) = 2\nu$.


(iv) As $\nu$ gets larger, $\chi^2$ approaches the normal distribution with mean $\nu$
and standard deviation $\sqrt{2\nu}$. In practice, it has been determined that
the quantity $\sqrt{2\chi^2}$ provides a better approximation to normality than
$\chi^2$ itself for values of $\nu$ of 30 or more. The distribution of $\sqrt{2\chi^2}$ has a mean
equal to $\sqrt{2\nu - 1}$ and a standard deviation equal to 1.

(v) The sum of independent $\chi^2$ variates is also a $\chi^2$ variate. Therefore,
if $\chi_1^2$ is a $\chi^2$ variate with $\nu_1$ degrees of freedom and $\chi_2^2$ is another $\chi^2$
variate with $\nu_2$ degrees of freedom, then their sum $(\chi_1^2 + \chi_2^2)$ is also a
$\chi^2$ variate with $(\nu_1 + \nu_2)$ degrees of freedom. This property is known
as the additive property of $\chi^2$.


Contingency Table 


Let the given data be classified into p classes $A_1, A_2, \ldots, A_p$ according to attribute
x and into q classes $B_1, B_2, \ldots, B_q$ according to attribute y. Let $f_{ij}$ denote the
observed frequency of the cell belonging to both the classes $A_i$ (i = 1, 2, ..., p)
and $B_j$ (j = 1, 2, ..., q).


Let the total of all the frequencies belonging to the class $A_i$ be denoted
by $(A_i)$ and similarly let $(B_j)$ denote the total of all the frequencies belonging
to the class $B_j$. Then the given data can be set into a table of p rows and q
columns in the following manner:


Classes   B_1     B_2     ...   B_j     ...   B_q     Totals

A_1       f_11    f_12    ...   f_1j    ...   f_1q    (A_1)
A_2       f_21    f_22    ...   f_2j    ...   f_2q    (A_2)
...       ...     ...     ...   ...     ...   ...     ...
A_i       f_i1    f_i2    ...   f_ij    ...   f_iq    (A_i)
...       ...     ...     ...   ...     ...   ...     ...
A_p       f_p1    f_p2    ...   f_pj    ...   f_pq    (A_p)

Totals    (B_1)   (B_2)   ...   (B_j)   ...   (B_q)   N


Calculation of $\nu$ for Contingency Table


The theoretical frequencies in a contingency table are calculated by imposing
the limitations that the row totals, column totals and the grand total remain
constant (i.e., unchanged). Therefore, if there be p rows and q columns, then
each of p row totals and q column totals gives rise to one constraint and so
we have (p + q) constraints. But, the sum of the border rows and the sum of
the border columns must each be equal to the grand total and so one constraint
is diminished and so there are only (p + q - 1) constraints.


So, the no. of degrees of freedom is given as follows:

$$\nu = (n - c) = [pq - (p + q - 1)] = [pq - p - q + 1] = [p(q - 1) - (q - 1)] = (p - 1)(q - 1)$$
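
The chi-square value and degrees of freedom for a contingency table can be sketched in Python as follows. The sketch assumes the usual way of obtaining expected frequencies with fixed margins, namely expected frequency = (row total × column total) / N, which the text above implies but does not write out; the function name and the 2 × 2 table are hypothetical.

```python
def chi_square_contingency(table):
    """Return (chi-square, degrees of freedom) for a p x q table of
    observed frequencies given as a list of rows."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency keeps row, column and grand totals fixed.
            expected = row_totals[i] * column_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, dof


# Hypothetical 2 x 2 table of two attributes.
chi_square, dof = chi_square_contingency([[30, 20], [20, 30]])
print(chi_square, dof)
```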


Check Your Progress 


3. What is a parametric test? 


4. What is chi-square distribution? 


12.6 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. Hypothesis testing can be broadly divided into two types, which are 
as follows: 


• Parametric tests or standard tests of hypothesis
• Non-parametric tests or distribution-free tests of hypothesis


2. Fisher—Irwin test is applied where there is no difference between two 
sets of data. 


3. A parametric statistical test is one that makes assumptions about the 
parameters (defining properties) of the population distribution(s) from 
which one’s data are drawn. 


4. For large sample size, the sampling (probability) distribution of $\chi^2$ can
be closely approximated by a continuous curve known as the chi-square
distribution.


12.7 SUMMARY 


• The first step of the testing procedure is to establish the hypothesis
to be tested. These statistical hypotheses are generally
assumptions about the value of a population parameter; a hypothesis
specifies a single value or a range of values for the parameter. Rather
than constructing a single hypothesis, two different hypotheses are constructed.


• The two hypotheses are generally referred to as the (1) null hypothesis
denoted by $H_0$ and (2) alternative hypothesis denoted by $H_1$.


• The hypothesis may be rejected or accepted depending upon whether
the value of the test statistic falls in the rejection or the acceptance
region.

• Hypothesis testing can be broadly divided into two types, which are
as follows:
○ Parametric tests or standard tests of hypothesis
○ Non-parametric tests or distribution-free tests of hypothesis


• Parametric tests assume certain properties of the population from which
the sample is drawn, such as observations from a normal population, large
sample size, and population parameters like mean and variance.


• McNemar test is applied where the data is nominal in nature and is
related to two interrelated samples.


• The method for comparing two related samples in hypothesis testing
is the paired t-test.


• The statistical techniques of hypothesis testing involve proving a
statement that acts as an alternative for the hypothesis.


• From the observation (of data), different statistics are constructed to
estimate the population parameters. In general (but not always) the
sampling distribution of these statistics depends on the parameters and
form of the parent population.


• The theoretical frequencies in a contingency table are calculated by
imposing the limitations that the row totals, column totals and the grand
total remain constant (i.e., unchanged).


12.8 KEY WORDS 


• T-Test: It is any statistical hypothesis test in which the test statistic
follows a Student's t-distribution under the null hypothesis.


• Z-Test: It is any statistical test for which the distribution of the test
statistic under the null hypothesis can be approximated by a normal
distribution.


• F-Test: It is any statistical test in which the test statistic has an
F-distribution under the null hypothesis.


• Chi-Test: It is any statistical hypothesis test where the sampling
distribution of the test statistic is a chi-squared distribution when the
null hypothesis is true.


12.9 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions 


1. What are non-parametric tests? 
2. What is the F-test? 
3. How is chi-square testing classified? 


4. List the important properties of chi-square distribution. 


Long-Answer Questions 


1. Describe the various steps of the testing procedure. 
2. Illustrate how to test the difference between two population means. 
3. Examine the hypothesis test for comparing two related terms. 


4. Describe the chi-square test. 


12.10 FURTHER READINGS 


Creswell, John W. 2002. Research Design: Qualitative, Quantitative, and 
Mixed Methods Approaches. London: Sage Publications Inc. 


Booth, Wayne, Gregory G. Colomb and Joseph M. Williams. 1995. The Craft 
of Research. Chicago: University of Chicago Press. 

Bryman, Alan and Emma Bell. 2015. Business Research Methods. 4th Edition. 
United Kingdom: Oxford University Press. 


Gupta, S.L. and Hitesh Gupta. 2012. Business Research Methods. New Delhi: 
Tata McGraw Hill Education Private Limited. 



UNIT 13 OVERVIEW OF NON-PARAMETRIC TESTS


Structure 


13.0 Introduction 
13.1 Objectives 
13.2 Non-Parametric Test: Concept and Types 
13.2.1 Mann-Whitney Test 
13.2.2 Kruskal Wallis 
13.2.3 Sign Test 
13.3 Multivariate Analysis 
13.3.1 Factor Analysis 
13.3.2 Cluster 
13.3.3 Multidimensional Scaling (MDS) 
13.3.4 Discriminant Analysis 
13.4 The Process of Interpretation of Test Results 
13.4.1 Guidelines for Making Valid Interpretation 
13.5 Answers to Check Your Progress Questions 
13.6 Summary 
13.7 Key Words 
13.8 Self Assessment Questions and Exercises 
13.9 Further Readings 


13.0 INTRODUCTION 


In the previous unit, you primarily learnt about the different parametric tests.
In this unit, we will discuss non-parametric tests. As we have learnt,
non-parametric tests are sometimes called distribution-free tests because they
are based on fewer assumptions (e.g., they do not assume that the outcome
is approximately normally distributed). The cost of fewer assumptions is
that non-parametric tests are generally less powerful than their parametric
counterparts (i.e., when the alternative is true, they may be less likely to
reject $H_0$). We will discuss the different types of non-parametric tests in this
unit. The concept of multivariate analysis is also explained here.


13.1 OBJECTIVES 


After going through this unit, you will be able to:
• Discuss the different types of non-parametric tests
• Examine multivariate analysis
• Describe the process of interpreting test results


13.2 NON-PARAMETRIC TEST: CONCEPT AND TYPES


As you have learnt, there are situations where assumptions cannot be made.
In such situations, different statistical methods are used, which are known as
non-parametric tests. Thus, we can say that a non-parametric test is a test that
does not assume anything about the underlying distribution. It is sometimes
called a distribution-free test. There are various types of non-parametric tests.
These include:


• Sign test: It includes the one-sample sign test and the two-sample sign test.
• Fisher–Irwin test
• McNemar test
• Wilcoxon matched-pairs test


13.2.1 Mann-Whitney Test


This test was developed by H B Mann and D R Whitney in the 1940s. The test
is used to examine whether two samples have been drawn from populations
with the same location (mean). The application of a t test involves the assumption
that the samples are drawn from a normal population. If the normality
assumption is violated, this test can be used as an alternative to a t test. This
is a very powerful non-parametric test as this can be used both for qualitative
and quantitative data. A two-tailed hypothesis for a Mann-Whitney test could
be written as:


$H_0$: Two samples come from identical populations
or
Two populations have identical probability distribution.


$H_1$: Two samples come from different populations
or
Two populations differ in locations.


The procedure involved in the use of Mann-Whitney U test is very 
simple and is described in the following steps: 


(i) The two samples are combined (pooled) into one large sample and
then we determine the rank of each observation in the pooled sample.
If two or more sample values in the pooled samples are identical, i.e.,
if there are ties, the sample values are each assigned a rank equal to
the mean of the ranks that would otherwise be assigned.


(ii) We determine the sum of the ranks of each sample. Let $R_1$ and $R_2$
represent the sum of the ranks of the first and the second sample whereas
$n_1$ and $n_2$ are the respective sample sizes of the first and the second
sample. For convenience, choose $n_1$ as the smaller size if they are unequal
so that $n_1 \leq n_2$. A significant difference between $R_1$ and $R_2$ implies a
significant difference between the samples.


(iii) Define

$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$

and

$$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$

Please note that the following expression will hold true:

$$U_1 + U_2 = n_1 n_2$$


Mann-Whitney test for a large sample: If $n_1$ or $n_2$ is greater than 10,
a large-sample approximation can be used for the distribution of the
Mann-Whitney U statistic. For this purpose, either $U_1$ or $U_2$ could be
used for a one-tailed or a two-tailed test. Here, $U_1$ is used.

Under the assumption that the null hypothesis is true, the $U_1$ statistic
follows an approximately normal distribution with mean and standard
deviation:

$$\mu_{U_1} = \frac{n_1 n_2}{2}, \qquad \sigma_{U_1} = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

The test statistic is:

$$Z = \frac{U_1 - \mu_{U_1}}{\sigma_{U_1}}$$

Assuming the level of significance is $\alpha$, for a two-tailed test the
null hypothesis is rejected if the absolute sample value of $Z$ is greater
than the critical value $Z_{\alpha/2}$. A similar procedure is used for a
one-tailed test. For a one-sided upper-tail test, the null hypothesis is
rejected if the sample value of $Z$ is greater than the critical value
$Z_{\alpha}$. For a one-sided lower-tail test, the null hypothesis is
rejected if the sample $Z$ is less than $-Z_{\alpha}$.
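
The steps above can be sketched in a few lines of Python. This is only an
illustrative sketch under stated assumptions: the function names
`average_ranks` and `mann_whitney_u` and the two groups of scores are
hypothetical, and the normal approximation shown applies no tie correction
to the standard deviation.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return rank sums, U1, U2 and the large-sample Z based on U1."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # step (i): pool and rank
    r1 = sum(ranks[:n1])                                  # step (ii): rank sums
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1                 # step (iii)
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return {"R1": r1, "R2": r2, "U1": u1, "U2": u2, "Z": (u1 - mu) / sigma}


# Hypothetical scores for two independent groups (n1 = 11, n2 = 12).
group_a = [12, 15, 11, 18, 14, 13, 16, 10, 17, 19, 9]
group_b = [20, 14, 22, 25, 21, 16, 24, 23, 26, 27, 28, 18]
result = mann_whitney_u(group_a, group_b)
print(result)
```

The returned Z would then be compared with the tabulated critical value,
as described above; the identity $U_1 + U_2 = n_1 n_2$ gives a quick check
on the arithmetic.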


13.2.2 Kruskal Wallis 


One of the assumptions used in the ANOVA technique is that all the involved 
populations from where the samples are taken are normally distributed. If 
this assumption does not hold true, the F-statistic used in ANOVA becomes 
invalid. The normality assumptions may not hold true when we are dealing 
with ordinal data or when the size of the sample is very small. 


The Kruskal-Wallis test comes to our rescue during such situations. 
This is, in fact, a non-parametric counterpart to the one-way ANOVA. The 
test is an extension of the Mann-Whitney U test discussed earlier. Both 
methods require that the scale of the measurement of a sample value should 
be at least ordinal. 


The hypotheses to be tested in the Kruskal-Wallis test are:

$H_0$: The k populations have identical probability distributions.
$H_1$: At least two of the populations differ in location.

The procedure for the test is listed below:

(i) Obtain random samples of sizes $n_1, n_2, \ldots, n_k$ from the k populations.
Therefore, the total sample size is $n = n_1 + n_2 + \cdots + n_k$.


(ii) Pool all the samples and rank them, with the lowest score receiving a 
rank of 1. Ties are to be treated in the usual fashion by assigning an 
average rank to the tied positions. 

(iii) Let $R_i$ = the total of the ranks from the i-th sample.

The Kruskal-Wallis test uses the $\chi^2$ distribution to test the null
hypothesis. The test statistic is given by:

$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1),$$

which approximately follows a $\chi^2$ distribution with $k-1$ degrees of freedom,
where, $k$ = Number of samples
$n$ = Total number of elements in the k samples
$n_i$ = Number of elements in the i-th sample.

The null hypothesis is rejected if the computed value of $H$ is greater than
the critical value of $\chi^2$ at the level of significance $\alpha$.
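
As a rough illustration of the procedure, the statistic can be computed as
follows. The function names and the three small samples are hypothetical,
no tie correction is applied, and the critical value 5.991 is the tabulated
$\chi^2$ value for 2 degrees of freedom at $\alpha = 0.05$.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return the H statistic and its degrees of freedom (k - 1)."""
    pooled = [x for sample in samples for x in sample]  # step (ii): pool
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for sample in samples:
        r_i = sum(ranks[start:start + len(sample)])     # step (iii): rank total
        total += r_i ** 2 / len(sample)
        start += len(sample)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, len(samples) - 1


# Hypothetical ordinal scores from three populations.
samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, df = kruskal_wallis_h(samples)
critical = 5.991  # tabulated chi-square value, df = 2, alpha = 0.05
print(h, df, "reject H0" if h > critical else "do not reject H0")
```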


13.2.3 Sign Test 


The Mann-Whitney U test just discussed assumes that the two samples are
independent. However, there are instances when the two samples are matched
(paired), the normality assumption is not satisfied, and one has to resort
to a non-parametric test for paired data. One such test, discussed earlier,
is the two-sample sign test. In that test, only the sign of the difference
(positive or negative) is taken into account and no weightage is assigned
to the magnitude of the difference. The Wilcoxon matched-pair signed rank
test takes care of this limitation and attaches greater weightage to a
matched pair with a larger difference. The test, therefore, makes use of
more information than the sign test and is a more powerful test.


The test procedure is outlined in the following steps:

(i) Let $d_i$ denote the difference in the scores for the i-th matched pair.
Retain the signs, but discard any pair for which $d_i = 0$.

(ii) Ignoring the signs of the differences, rank all the $d_i$ from the lowest
to the highest. In case differences have the same numerical value, assign
to them the mean of the ranks involved in the tie.

(iii) To each rank, prefix the sign of the difference.

(iv) Compute the sums of the absolute values of the negative and the positive
ranks, denoted $T^-$ and $T^+$ respectively.

(v) Let $T$ be the smaller of the two sums found in step (iv).

When the number of pairs of observations ($n$) for which the difference
is not zero is greater than 15, the $T$ statistic follows an approximately
normal distribution under the null hypothesis that the population
differences are centred at 0. The mean $\mu_T$ and standard deviation
$\sigma_T$ of $T$ are given by:

$$\mu_T = \frac{n(n+1)}{4} \quad \text{and} \quad \sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$

The test statistic is given by:

$$Z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}$$

For a given level of significance $\alpha$, the absolute sample $Z$ should be
greater than the critical value $Z_{\alpha/2}$ to reject the null hypothesis
in a two-tailed test. For a one-sided upper-tail test, the null hypothesis is
rejected if the sample $Z$ is greater than $Z_{\alpha}$, and for a one-sided
lower-tail test, the null hypothesis is rejected if the sample $Z$ is less
than $-Z_{\alpha}$.
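
The five steps and the normal approximation can be sketched as below. The
function names and the before/after scores are hypothetical; the example
uses only five pairs to keep the arithmetic visible, whereas the
approximation itself is intended for more than 15 non-zero differences.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return n, T+, T-, T and the large-sample Z for matched pairs."""
    # Step (i): signed differences, discarding pairs with zero difference.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    # Step (ii): rank the absolute differences, averaging ranks over ties.
    ranks = average_ranks([abs(d) for d in diffs])
    # Steps (iii)-(iv): sum the ranks separately by sign of the difference.
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t = min(t_plus, t_minus)  # step (v)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return {"n": n, "T_plus": t_plus, "T_minus": t_minus, "T": t,
            "Z": (t - mu) / sigma}


# Hypothetical scores for five matched pairs.
before = [10, 12, 9, 14, 11]
after = [12, 11, 12, 14, 15]
print(wilcoxon_signed_rank(before, after))
```

Since $T^+ + T^- = n(n+1)/2$, the two rank sums can be checked against
each other before $Z$ is computed.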


13.3 MULTIVARIATE ANALYSIS 


Let us now discuss multivariate analysis. 
13.3.1 Factor Analysis 


Factor analysis is a multivariate statistical technique in which there is no
distinction between dependent and independent variables. In factor analysis,
all variables under investigation are analysed together to extract the
underlying factors. Factor analysis is a data reduction method. It is a very
useful method for reducing a large number of variables, which result in data
complexity, to a few manageable factors. These factors explain most of the
variation in the original set of data. A market researcher might have
collected data on, say, more than 50 attributes (or items) of a product,
which may become very difficult to analyse. Factor analysis could help to
reduce the data on the 50-odd attributes to a few manageable factors. It
helps in identifying the underlying structure of the data.


A factor is a linear combination of variables. It is a construct that is not
directly observable but that needs to be inferred from the input variables. The
factors are statistically independent. We will show their application in
regression analysis, as the factor scores, when used as independent variables
in a regression, help to solve the problem of multicollinearity. (The
problem of multicollinearity in a regression model arises when the independent
variables are so highly correlated that it becomes difficult to separate out the
influence of each of the independent variables on the dependent variable.)
The factor scores could also be used in other multivariate techniques.


Uses of Factor Analysis 


The technique of factor analysis has multiple uses as discussed in the 
following situations: 


Scale construction: Factor analysis could be used to develop concise
multiple-item scales for measuring various constructs. We have already
discussed, in the chapter Attitude Measurement and Scaling, the process of
developing a multiple-item scale, which typically starts by generating a
large set of items (statements) relating to the attitude being measured.
This is done as part of exploratory research. Factor analysis can reduce the
set of statements to a concise instrument and, at the same time, ensure that
the retained statements adequately represent the critical aspects of the
constructs being measured. Suppose we want to prepare a multiple-item scale
for measuring the job satisfaction of skilled workers in an organization. As
the first step, we would generate a large number of statements, say 100 or
so, as part of exploratory research. These statements could be subjected to
factor analysis; let us assume that we get three factors out of it. Now, if
we want to construct a 15-item scale to measure job satisfaction, we could
select, for each of the three factors, the five items having the highest
factor loadings. The concept of factor loading will be discussed later in
the book. This way, a 15-item scale to measure job satisfaction could be
developed.


Establish antecedents: This method reduces multiple input variables into 
grouped factors. Thus, the independent variables can be grouped into broad 
factors. For example, all the variables that measure the safety clauses in 
a mutual fund could be reduced to a factor called safety clause. Thus, the 
company could know about the broad benefit that an investor seeks in a fund. 


Psychographic profiling: Different independent variables are grouped to 
measure independent factors. These are then used for identifying personality 
types. One of the most well-known inventories based on this technique is 
called the 16 PF inventory. 


Segmentation analysis: Factor analysis could also be used for segmentation.
For example, there could be different segments of two-wheeler customers, who
own two-wheelers for different reasons, depending on the importance they give
to factors like prestige, economy considerations and functional features.


Marketing studies: The technique has extensive use in the field of marketing
and can be successfully used for new product development, product
acceptance research, development of advertising copy, pricing studies and
branding studies. For example, we can use it to:

• identify the attributes of brands that influence consumers’ choice;
• get an insight into the media habits of various consumers;
• identify the characteristics of price-sensitive customers.

Conditions for a Factor Analysis Exercise


Factor analysis requires some specific conditions that must be ensured before 
executing the technique. These are mentioned in detail in this section. 


• Factor analysis requires metric data. This means the data should be
either interval or ratio scale in nature. The variables for factor
analysis are identified through exploratory research, which may be
conducted by reviewing the literature on the subject and the research
already carried out in the area, by informal interviews with
knowledgeable persons, by qualitative analysis such as focus group
discussions held with a small sample of the respondent population, by
analysis of case studies and by the judgement of the researcher.
Generally, in survey research, a five- or seven-point Likert scale or
any other interval scale may be used.


As the responses to different statements are obtained through different
scales, all the responses need to be standardized. The standardization
helps in comparing the responses from such scales. It is carried out
using the following formula:

$$\text{Standardized score of the } i\text{-th respondent on a statement} = \frac{\text{Actual score of the } i\text{-th respondent on the statement} - \text{Mean of all respondents on the statement}}{\text{Standard deviation of all respondents on the statement}}$$


• The size of the sample of respondents should be at least four to five
times the number of variables (number of statements).


• The basic principle behind the application of factor analysis is that the
initial set of variables should be highly correlated. If the correlation
coefficients between all the variables are small, factor analysis may not
be an appropriate technique. A correlation matrix of the variables could
be computed and tested for its statistical significance. The hypotheses
to be tested may be written as:

$H_0$: The correlation matrix is insignificant, i.e., the correlation
matrix is an identity matrix where the diagonal elements are one and
the off-diagonal elements are zero.

$H_1$: The correlation matrix is significant.

The test is carried out by using the Bartlett test of sphericity, which
takes the determinant of the correlation matrix into consideration. The
test converts it into a chi-square statistic with degrees of freedom equal
to $k(k-1)/2$, where $k$ is the number of variables on which factor
analysis is applied. The significance of the correlation matrix ensures
that a factor analysis exercise could be carried out.


• Another condition which needs to be fulfilled before a factor analysis
could be carried out concerns the value of the Kaiser-Meyer-Olkin (KMO)
statistic, which takes a value between 0 and 1. For the application of
factor analysis, the value of the KMO statistic should be greater than 0.5.
The KMO statistic compares the magnitudes of the observed correlation
coefficients with the magnitudes of the partial correlation coefficients.
A small value of KMO shows that the correlations between variables cannot
be explained by the other variables.
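
The conditions above can be checked numerically. The sketch below uses
NumPy and simulated responses, so the data, the function names and the
sample-size multiple are all hypothetical. The text does not state the
Bartlett formula; the code assumes the commonly cited form
$\chi^2 = -\left[(n-1) - \frac{2k+5}{6}\right]\ln|R|$, and it assumes that
the partial correlations needed for KMO are obtained from the inverse of
the correlation matrix.

```python
import numpy as np


def standardize(data):
    """Column-wise standardized scores: (score - mean) / standard deviation."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)


def sample_size_ok(data, multiple=4):
    """Check that respondents number at least `multiple` times the variables."""
    n, k = np.asarray(data).shape
    return n >= multiple * k


def bartlett_sphericity(data):
    """Chi-square statistic and degrees of freedom, k(k-1)/2."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    corr = np.corrcoef(data, rowvar=False)
    chi_sq = -(n - 1 - (2 * k + 5) / 6) * np.log(np.linalg.det(corr))
    return chi_sq, k * (k - 1) // 2


def kmo(data):
    """Overall KMO: squared correlations against squared partial correlations."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                      # partial correlation matrix
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r_sq = np.sum(corr[off_diag] ** 2)
    p_sq = np.sum(partial[off_diag] ** 2)
    return r_sq / (r_sq + p_sq)


# Simulated responses: 80 respondents, 4 statements sharing one common factor.
rng = np.random.default_rng(1)
common = rng.normal(size=(80, 1))
responses = common + 0.5 * rng.normal(size=(80, 4))

print(sample_size_ok(responses))
print(bartlett_sphericity(standardize(responses)))
print(kmo(responses))
```

The chi-square value would be compared with the tabulated critical value
for the stated degrees of freedom, and the KMO value with the 0.5
threshold mentioned above.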




Steps in a Factor Analysis Exercise 


There are basically two steps that are required in a factor analysis exercise. 


1. Extraction of factors: The first and the foremost step is to decide on 
how many factors are to be extracted from the given set of data. This 
could be accomplished by various methods like the centroid method, 
the principal component method and the maximum likelihood method. 
Here, only the principal component method will be discussed very
briefly. Since factors are linear combinations of the variables,
which are supposed to be highly correlated, the mathematical form of
a factor could be written as:

$$F_i = W_{i1} X_1^* + W_{i2} X_2^* + W_{i3} X_3^* + \cdots + W_{ik} X_k^*$$

where,

$X_j^*$ = j-th standardized variable

$F_i$ = Estimate of the i-th factor

$W_{ij}$ = Weight or factor score coefficient of the j-th standardized
variable in the i-th factor

$k$ = Number of variables


The principal component methodology involves searching for those
values of the weights for which the first factor explains the largest
portion of the total variance. This is called the first principal factor.
This explained variance is then subtracted from the original input matrix
so as to yield a residual matrix. A second principal factor is extracted
from the residual matrix in such a way that the second factor takes care
of most of the residual variance. One point that has to be kept in mind is
that the second principal factor has to be statistically independent of
the first principal factor. The same principle is then repeated until
there is little variance left to be explained. Theory may be used to
specify how many factors should be extracted, or the number may be based
on the criterion of the Kaiser-Guttman method. This method states that the
number of factors to be extracted should be equal to the number of factors
having an eigenvalue of at least 1. Since each of the standardized
variables in the original data set has a variance of 1 (eigenvalue of 1),
if there are 50 variables then the total variation in the data set will
be 50.


We know that a factor is a linear combination of the various variables.
The eigenvalue for each of the factors is computed and only those
factors that have an eigenvalue of at least 1 are accepted as per the
Kaiser-Guttman method. All those factors having eigenvalues less than 1
are rejected. This is because each of the variables has a variance of 1
and, therefore, a factor, which is a linear combination of these
variables, should explain at least as much variance as a single variable,
i.e., it should not have an eigenvalue less than 1.

Another output of the factor analysis exercise is a factor score,
which is computed for each of the factors corresponding to each
respondent. Most software, including SPSS, provides a factor score for
each respondent and each factor. As the factor scores are statistically
independent, they can be used in regression and discriminant analysis
as independent variables. This will be explained briefly in the text
later on.


The correlation coefficient of the extracted factor score with a variable 
is called the factor loading. In most computer printouts, a matrix of 
factor loadings called factor matrix or component matrix is presented. 
Factor loadings play a very important role in the computations of 
eigenvalues of each factor and also in computing the communalities 
of each variable. These concepts would be discussed in depth with the 
help of a numerical exercise. 


2. Rotation of factors: The second step in the factor analysis exercise is
the rotation of the initial factor solution. This is because the initial
factors are very difficult to interpret. Therefore, the initial solution
is rotated so as to yield a solution that can be interpreted easily. Most
computer software gives options for orthogonal rotation (for example,
varimax rotation) and oblique rotation. Generally, the varimax rotation is
used as this results in independent factors. The varimax rotation method
maximizes the variance of the loadings within each factor. The variance of
the loadings of a factor is largest when its smallest loading tends
towards zero and its largest loading tends towards unity. The basic idea
of rotation is to get some factors that have a few variables that
correlate highly with that factor and some that correlate poorly with it.
Similarly, there are other factors that correlate highly with those
variables with which the first factors do not have a significant
correlation. Therefore, the rotation is carried out in such a way that the
factor loadings obtained in the first step move close to unity or zero.
This procedure avoids the problem of having factors on which all variables
have mid-range correlations. This is all done for a better interpretation
of the results and for ease in naming the factors. Once this is done, a
cut-off point on the factor loading is selected. There is no hard and fast
rule to decide on the cut-off point. However, generally it is taken to be
greater than 0.5. All those variables attached to a factor, once the
cut-off point is decided, are used for naming the factor. This is a very
subjective procedure and different researchers may name the same factors
differently. Another point to be noted is that a variable which appears in
one factor should not appear in any other factor. This means that a
variable should have a high loading on only one factor and a low loading
on the other factors. If that is not the case, it implies that the
question has not been understood properly by the respondent or it may not
have been phrased clearly. Another possible cause could be that the
respondent may have more than one opinion about a given item (statement).


The total variance explained by all the factors taken together remains 
the same after rotation. However, the amount of variations for each individual 
factor may undergo a change. The communalities for each variable under the 
two procedures remain unchanged. 
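
The two steps can be illustrated with a short NumPy sketch. It is only a
sketch under stated assumptions: the data are simulated (six variables
built from two hypothetical underlying factors), the function names are
hypothetical, extraction is done by an eigen-decomposition of the
correlation matrix, and the rotation uses a widely circulated iterative
varimax routine rather than the output of any particular package.

```python
import numpy as np


def extract_factors(data):
    """Principal component extraction with the Kaiser-Guttman criterion."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= 1.0                        # retain eigenvalues of at least 1
    # Loadings: correlations between each variable and each retained factor.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.sum(rotated ** 2, axis=0)    # sum of squared loadings per factor
        grad = loadings.T @ (rotated ** 3 - rotated @ np.diag(col_ss) / p)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_criterion = np.sum(s)
        if criterion != 0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


# Simulated data: variables 1-3 follow one factor, variables 4-6 another.
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 2))
data = np.hstack([f[:, [0]] + 0.5 * rng.normal(size=(200, 3)),
                  f[:, [1]] + 0.5 * rng.normal(size=(200, 3))])

eigvals, loadings = extract_factors(data)
rotated = varimax(loadings)

print(np.round(eigvals, 2))                      # eigenvalues sum to 6
print(np.round(rotated, 2))                      # rotated factor matrix
print(np.round((loadings ** 2).sum(axis=1), 3))  # communalities before rotation
print(np.round((rotated ** 2).sum(axis=1), 3))   # communalities after rotation
```

The row sums of squared loadings (communalities) and the total variance
explained are the same before and after rotation, while the column sums
(the variance attributed to each individual factor) may change.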


13.3.2 Cluster 


Cluster analysis is a grouping technique. The basic assumption underlying
the technique is that similarity is based on multiple variables, and the
technique attempts to measure proximity in terms of the study variables.
The emerging groups are homogeneous in their composition and heterogeneous
as compared to the other groups. The grouping can be done for objects,
individuals, entities and products. The researcher identifies a set of
clustering variables which are assumed to be significant for the purpose
of classifying the objects into groups. The technique is also referred to
as a classification technique, numerical taxonomy and Q analysis. This is
basically because the technique is used in various disciplines, like
psychology, sociology, engineering and management. If one were to plot the
groups geometrically, a robust cluster solution is one where individual
objects in one cluster are concentrated together and where the individual
clusters are far apart from each other. Figure 13.1(a) shows a simple
cluster solution of breakfast food based on people who seek nutrition and
convenience (ease of preparation).

Fig 13.1(b) Actual Cluster Solution


However, the actual situation might be different as the person might
be using different criteria for a weekday and for a weekend breakfast. Thus,
as the criteria for decision-making become multiple, the grouping does not
happen in a simple two-dimensional space but becomes multidimensional
[Figure 13.1(b)]. Thus, the researcher is able to group people on these three
dimensions, and the point regarding the interpretation of benefits sought
becomes clear as one understands the multidimensionality of needs. Thus, a
bakery/confectionery shop selling sandwiches, patties, bread rolls as well as
freshly ground idli batter would, using the solution, know: (1) the lucrative
segment, (2) the segment which might be motivated to buy if one takes care
of their weekday/weekend needs, and (3) a segment which is currently not
interested in getting a ‘ready-to-eat’ breakfast solution and might not look at
the bakery as an outlet to visit in the morning. Once the homogeneous clusters
emerge, the next step is to determine the profile of each group: Who are
they? What is their gender, age group, family size, etc.? What deals
motivate them to buy from a particular store when they are buying eatables
in general?

Differentiating Cluster Analysis 


In terms of the nature of the technique vis-a-vis the other multivariate 
techniques, cluster analysis is similar in terms of analysing the function of 
multiple independent variables. However, there are essential differences 
between the other data reduction techniques and cluster analysis. 


In factor analysis, the objective was to reduce the original correlated 
variables to a more manageable number of orthogonal or oblique factors. 
However, the data reduction was carried out on the columns of the data 
matrix. On the other hand, in cluster analysis the focus is on the rows, or 
the individuals or entities and the objective is to group the individuals on 
the variables. 


The other data classification technique is two-group discriminant
analysis. Here also, one might wish to group individuals or objects into
groups, but the classification or identification of groups is a priori. Thus,
in that technique one has an established classification rule and the
objective is to validate it, i.e., to test whether the groups obtained by
the identified function are correctly classified or not. In cluster
analysis, the whole population/sample is initially undifferentiated; the
technique attempts to assess similarity in responses to the variables, and
the grouping happens after the clustering.


Usage of Cluster Analysis 


Cluster analysis has widespread applicability in all the branches of social 
sciences and management. In management science, its most valuable 
contribution is in the area of marketing, especially market segmentation. 
Some applications of the technique are as follows: 


• Market segmentation: As we know, market segmentation is the
process of splitting customers/potential customers within a market
into different groups/segments, where customers have the same/similar
requirements satisfied by a distinct marketing mix (McDonald and
Dunbar, 1998). This is one area that has seen maximum theorization
on the basis of the outputs of the technique. Some examples are
ACORN (A Classification of Residential Neighbourhoods, based on 40
variables, e.g., house/car ownership, employment, religion, lifestyle,
etc.) and PRIZM (Potential Rating Index by Zip Market, based on
39 variables, for example, education, affluence, family life cycle,
urbanization, race and ethnicity, mobility, etc.). The solution provides
62 lifestyle categories. The advantage of the technique is that one can
look at the combination of variables to predict consumer or potential
consumer groups. The best examples of clustered solutions are in the
area of benefit segmentation (Haley, 1968). Here, the consumers are
divided into groups based on the benefits they seek from the product
category. These groups, then, could cut across age groups, gender and
other variables. Thus, a marketer could design a product on the basis of
this segmentation approach. Yankelovich (1964) segmented consumers
in terms of ‘what they look for in a watch’ and classified people into
those who are price driven, those who are durability and quality driven,
and those driven by occasion-bound symbolism. Sinha (2003) classified
food shoppers into fun and work shoppers based on the benefits they seek
from grocery/food purchase. Sondhi and Singhvi (2005) classified
grocery shoppers into transition shoppers, traditional shoppers, thrifty
shoppers and indifferent shoppers.


• Segmenting industries/sectors: The researcher could also go about
grouping products or sectors (e.g., health or education) into blocks 
that have some common trait(s). This makes it easier for both the 
organizations and policy-makers while planning or evaluating the 
performance of the group. 


• Segmenting markets: Cities or regions with some common traits like
population mix, infrastructure development, climatic or socio-economic 
conditions could be clustered together. If one city in Kerala and another 
in Andhra Pradesh are in one cluster, then the organization is able to 
plan and execute a similar business approach in the two areas. 


• Career planning and training analysis: In the area of human resources
(HR), the technique can be used to group people into clusters on the basis
of their educational qualifications, experience, aptitude and aspirations.
This grouping can assist the HR division in effectively managing training
and manpower development for the members of different clusters.


• Segmenting financial sectors/instruments: This is an emerging area
where different factors like raw material cost, financial allocations, 
seasonality and other factors are being used to group sectors together to 
understand the growth and performance of a group of industries. This 
also assists the policy-makers and the financial analysts in assessing 
the monetary implications. A number of researchers are making use 
of clustering principles to group consumers and their investment 
behaviour on the basis of the combination of different variables and 
benefits sought (behavioural finance). 


The basic premise of the above technique is, as we said earlier, that wherever
a researcher wants to manage data (especially individual or organizational)
and perceives that there could be multiple factors involved, cluster
analysis is the best classification technique at his/her disposal.

Statistics Associated With Cluster Analysis

Before we review the statistics involved with the technique, it is essential once
again to examine the simplicity of the technique. Unlike the other multivariate
techniques that we have discussed till now, cluster analysis is the simplest in
terms of mathematical derivations. The simplest way to explain the technique
is to understand that it measures the distance between objects on the
basis of multiple variables and looks for similarity as a function of distance,
i.e., the shorter the distance between two objects, the more similar they are.


Metric data analysis: For obtaining a cluster solution to data that is collected 
on an interval or ratio scale the statistical assessment of the distance between 
two objects can be done by calculating the Euclidean distance between 
them. In case the study has two variables (as stated in the earlier example of 
nutrition and ease of preparation) then the distance between person A and B 
can be calculated: 


$$d_{AB} = \sqrt{(X_{B1} - X_{A1})^2 + (X_{B2} - X_{A2})^2}$$

where $X_{B1}$ represents the coordinate of person B on nutrition (interval
scale data).


A note of caution here: The Euclidean distance is not ‘scale invariant’. The
relative ordering of the objects in terms of their similarity can be
affected by a simple change in the scale by which one or more of the
variables are measured. Thus, it is generally advisable that the data be
standardized before being subjected to analysis. However, standardization
can sometimes reduce the differences between the groups on the very
variables that may be the best discriminators of group differences.
Thus, care needs to be taken at the questionnaire design stage to keep the
measurement scales of the variables in roughly the same range, so that
standardization can be avoided. Only if the variables are measured in widely
different units is standardization needed, to prevent the variables measured
in larger units from dominating the cluster solution.


In the example, the two variables were placed on a 10-point scale of
importance (with 1 = very important and 10 = very unimportant). The values
selected by persons A and B were as follows:

Person    Nutrition    Ease of preparation
A         1            2
B         5            2

Then the distance between the two is,

$$d_{AB} = \sqrt{(5-1)^2 + (2-2)^2} = 4.0$$

Suppose there was a third person C who had selected

Person    Nutrition    Ease of preparation
C         6            2

Then the distance between A and C would be 5.0 and between B and
C would be 1.0.


Thus, B and C are the most similar pair as the inter-person distance is the 
least and, as stated earlier, the shorter the distance, the greater the similarity. 
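
The worked example can be reproduced with a few lines of Python. The
function name `euclidean` and the dictionary layout are illustrative
choices; the ratings are those given for persons A, B and C above.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two persons' ratings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# (nutrition, ease of preparation) on the 10-point importance scale.
ratings = {"A": (1, 2), "B": (5, 2), "C": (6, 2)}

distances = {
    (i, j): euclidean(ratings[i], ratings[j])
    for i, j in combinations(ratings, 2)
}
for pair, d in distances.items():
    print(pair, d)

closest = min(distances, key=distances.get)
print("Most similar pair:", closest)  # ('B', 'C')
```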


If, in addition to having nutrition and ease of preparation for breakfast,
we also had a variable that measured cost, we would effectively have a
3-dimensional solution. Then the formula would have been:

$$d_{AB} = \sqrt{(X_{B1} - X_{A1})^2 + (X_{B2} - X_{A2})^2 + (X_{B3} - X_{A3})^2}$$

And generally, for any two objects, i and j:

$$d_{ij} = \sqrt{\sum_{k=1}^{n} (X_{ik} - X_{jk})^2}$$

where,

$d_{ij}$ = Distance between person i and j
$k$ = Variable (interval/ratio)
$n$ = Number of variables
$i$ = Object/person
$j$ = Object/person

Also, there are other distance measures available like the city-block or 
Manhattan distance between two objects, which is the sum of the absolute 
differences in the values for each variable. Another distance measure is the 
Chebychev distance between two objects, which is the maximum absolute 
difference in values for any variable. However, the most commonly used 
measure is the squared Euclidean distance. A point to be noted here is 
that clustering with squared Euclidean distance is faster than the regular 
Euclidean distance. Thus, for the purpose of clustering, we make use of 
squared Euclidean distance. The equation for this is the same as the Euclidean 
distance; only the square root is not calculated. 


Then, based on the distance calculated, a distance matrix is created and 
clusters are created by moving from the most to the least similar pair based 
on a clustering method. 
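
A small sketch of the alternative distance measures and of the distance
matrix follows. The function names are illustrative, and the third rating
(cost) for each person is a hypothetical value added only to show a
three-variable case.

```python
from itertools import combinations


def squared_euclidean(p, q):
    """Euclidean distance without the square root."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def manhattan(p, q):
    """City-block distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def chebyshev(p, q):
    """Maximum absolute difference on any one variable."""
    return max(abs(a - b) for a, b in zip(p, q))


def distance_matrix(points, metric):
    """Pairwise distances, keyed by the pair of object names."""
    return {
        (i, j): metric(points[i], points[j])
        for i, j in combinations(points, 2)
    }


def pairs_by_similarity(points, metric):
    """Pairs ordered from the most similar (smallest distance) to the least."""
    matrix = distance_matrix(points, metric)
    return sorted(matrix, key=matrix.get)


# (nutrition, ease of preparation, cost); the cost values are hypothetical.
ratings = {"A": (1, 2, 3), "B": (5, 2, 6), "C": (6, 2, 4)}

for metric in (squared_euclidean, manhattan, chebyshev):
    print(metric.__name__, distance_matrix(ratings, metric))
    print("  order:", pairs_by_similarity(ratings, metric))
```

The ordering of the pairs can differ from one distance measure to another,
which is why the choice of measure is reported along with the cluster
solution.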


13.3.3 Multidimensional Scaling (MDS) 


The underlying presumptions that one makes while creating an MDS are: 
• The individual tries to group objects together.

• The grouped objects are usually evaluated and compared with each
other so that they can coexist on a spatial map.

• The basis of evaluation is not unidimensional and the user is at all times
(consciously or unconsciously) using an underlying multidimensional
space to evaluate the objects.


MDS essentially plots visually the perceptions and preferences of
individuals, singly and as a group, regarding a group of objects, individuals
or both, even when the information about the dimensions or bases of
evaluation is minimal.




Thus, the technique uses powerful mathematical tools in order to 
condense the data by creating visual representations based on the similarities 
or dissimilarities of data on a spatial map (Schiffman, et al. 1981). The map 
dimensions are hypothesized to be the attributes or features that the person 
uses to form certain impressions about the object. One of the most widely 
used mathematical methods to create the maps is based on Kruskal’s (1964) 
stress calculations (to be discussed further in the chapter). 
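

To give a feel for what a stress calculation measures, the sketch below computes a stress value for a candidate map. It is a simplification under stated assumptions: Kruskal's procedure compares map distances with disparities obtained through monotone regression, whereas this sketch uses the raw dissimilarity ratings in place of the disparities. The brand labels (P, Q, R), the coordinates and the ratings are all hypothetical. 

```python
import math
from itertools import combinations


def kruskal_stress(coords, dissimilarities):
    """Stress-1: sqrt( sum (d_ij - delta_ij)^2 / sum d_ij^2 ).

    d_ij is the distance between objects i and j on the map and delta_ij
    is the stated dissimilarity (used here directly as the disparity).
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(sorted(coords), 2):
        d = math.dist(coords[i], coords[j])
        delta = dissimilarities[(i, j)]
        numerator += (d - delta) ** 2
        denominator += d ** 2
    return math.sqrt(numerator / denominator)


# Hypothetical dissimilarity ratings between three objects.
dissimilarities = {("P", "Q"): 3.0, ("P", "R"): 4.0, ("Q", "R"): 5.0}

# Two candidate two-dimensional maps of the same three objects.
good_map = {"P": (0.0, 0.0), "Q": (3.0, 0.0), "R": (0.0, 4.0)}
poor_map = {"P": (0.0, 0.0), "Q": (3.0, 0.0), "R": (0.0, 1.0)}

print(kruskal_stress(good_map, dissimilarities))
print(kruskal_stress(poor_map, dissimilarities))
```

The first map reproduces the ratings exactly, so its stress is zero; the second map distorts two of the three distances and therefore has a higher stress. A lower stress indicates a map that fits the respondent's judgements better. 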


MDS usually involves a comparison of sorts to create a relative 
position of the considered objects. The comparison could be made on defined 
dimensions, or the apparent basis of comparison. However, more often than 
not, people make use of their own peculiar and sometimes subjective or 
perceived dimensions to make the comparison. For example, it could be the 
trust or faith in the service provider in handling the insured person’s problems 
effectively. Thus, two objects or brands with the same defined dimensions 
might be perceived very differently by the person because: 


• The evaluations might not be solely based on defined or observed 
parameters. 


• The subjective and the objective dimensions might be absolutely 
unrelated. 


To simplify the process further, the technique presents the dependent 
variable (which might be a similarity or dissimilarity between the objects, 
or preferences) and then tries to figure out the underlying independent 
variables or antecedents that led to the obtained map. The advantage 
of this method is that the researcher's influence, which arises when he/she 
attempts to provide the dimensions of comparison, is minimized. The 
disadvantage, however, is that it is difficult to figure out clearly which 
dimensions the respondents used for the comparison. 


Thus, the researcher needs to be fairly well versed with the probable 
parameters that a person might use for comparison. These perceived 
parameters might emerge from a qualitative analysis of the respondents’ 
decision process or through the researcher’s review of the secondary literature 
about the product. The inputs obtained would have to be objectively—without 
any element of personal bias—assessed to comprehend the defined or apparent 
and the hidden or subjective dimensions being used. 
To understand the respondent’s choices, let us look at a very simple example 
of a consumer who buys bread every day for his family breakfast. Now, we ask 
him which bread he buys. He tells us, ‘Harvest Gold, Britannia and Perfect.’ 
Next, we ask him the similarity between two bread brands, say, Harvest Gold 
and Britannia, on a 7-point scale, where 1 is very similar and 7 is very 
dissimilar. He says the similarity is 1. What this means is that: 


• If we were to take a mental model of his brain when he said this, the 
two brands would be very close to each other. 


• Suppose we say that the consumer was thinking of price and availability 
when he was telling us this. Thus, the unconscious evaluation that he 
did was on the two dimensions of ‘price’ and ‘brand’. So, these two 
brands are two points close to each other in this two-dimensional map. 


• The two manufacturers have to understand that there is no brand loyalty 
from the customer, as he could very easily buy the competing brand 
because the two are almost identical in his ‘mind’. 


Now, suppose we ask him if he has consumed Harvest Gold multi-grain 
bread, and he says, ‘yes’. So we now ask him to tell us the similarity 
between Harvest Gold regular and Harvest Gold multi-grain bread on the 
same 7-point scale. His answer is 6. Now, what will happen if we use the 
same dimensions as in the above case? The brand is the same for both, thus 
using a two-dimensional map would not be wise, as the consumer may 
now also be looking at the health benefit or nutritional content of the 
breads as a dimension. This means: 


• The bread brands now need a three-dimensional representation to 
represent their relative positioning in the consumer’s mind. 


• Harvest Gold multi-grain need not worry about competition from 
the other two, as the consumer who buys the multi-grain bread will not 
buy them as a substitute because they are very different from the bread 
he eats regularly. 


MDS is only one of the wide array of statistical techniques available 
for obtaining the object map. The whole range of these methods grouped 
together is termed as perceptual mapping techniques. 


Let us now briefly attempt to understand the underlying algorithms 
of MDS. 


• The inputs obtained from the respondents could be in terms of objects, 
individuals, brands, corporations or countries. 


• The comparison could be in terms of similarities/dissimilarities, e.g. 
how similar is Delhi to Mumbai on a 7-point scale ranging from the 
most dissimilar to the most similar; or preferences, e.g. out of the five 
... least preferred. 


• As you can observe, the respondent is NOT given any dimension to 
measure similarity or dissimilarity. 


• The preferences could be based on ranked data. 


• The respondent might be asked to conduct a paired comparison of the 
data. 


13.3.4 Discriminant Analysis 


Discriminant analysis is used to predict group membership. This technique is 
used to classify individuals/objects into one of the alternative groups on the 
basis of a set of predictor variables. The dependent variable in discriminant 
analysis is categorical and on a nominal scale, whereas the independent or 
predictor variables are either interval or ratio scale in nature. When there 
are two groups (categories) of dependent variable, we have two-group 
discriminant analysis and when there are more than two groups, it is a case 
of multiple discriminant analysis. In case of two-group discriminant analysis, 
there is one discriminant function, whereas in case of multiple discriminant 
analysis, the number of functions is at most one less than the number of 
groups (or equal to the number of predictor variables, if that is smaller). 


Objectives and Uses of Discriminant Analysis 


The objectives of discriminant analysis are the following: 


• To find a linear combination of variables that discriminates between 
categories of the dependent variable in the best possible manner. 


• To find out which independent variables are relatively better in 
discriminating between groups. 


• To determine the statistical significance of the discriminant function 
and whether any statistical difference exists among groups in terms of 
predictor variables. 


• To develop the procedure for assigning new objects, firms or individuals 
whose profile but not the group identity is known to one of the two 
groups. 


• To evaluate the accuracy of classification, i.e., the percentage of 
customers that it is able to classify correctly. 


Discriminant analysis can be a very powerful technique of analysis in 
multiple situations. Some areas in which it is extensively used are as follows: 


• Scale construction: Discriminant analysis is used to identify the 
variables/statements that are discriminating and on which people 
with diverse views will respond differently. For example, in case 
one wants to assess people who believe that corporate governance is 
the responsibility of policy-makers against those who think it needs 
to be self-driven or individual-centric, one may generate a number 
of statements and then conduct a pilot study and select only those 
statements on which the two groups differ significantly. 


• Segment discrimination: Most business managers recognize that 
the population under consideration can never be totally homogeneous 
in composition. Therefore, to understand what the key variables are 
on which two or more groups differ from each other, this technique 
is extremely useful. Questions to which one may seek answers are as 
follows: 


o What are the demographic variables on which potentially successful 
salesmen and potentially unsuccessful salesmen differ? 


o What are the variables on which users/non-users of a product can 
be differentiated? 


o What are the economic and psychographic variables on which 
price-sensitive and non-price-sensitive customers can be differentiated? 


o What are the variables on which the buyers of local/national brands 
of a product can be differentiated? 


• Perceptual mapping: The technique is also used extensively to create 
attribute-based spatial maps of the respondent’s mental positioning of 
brands. The advantage of the technique is that it can present brands 
or objects and the attributes on the same map. Therefore, the business 
manager can determine which attribute is the unique selling proposition 
(USP) of which brand, and which attributes are valued by the respondent 
but are not currently satisfied by any brand. 


Discriminant Analysis Model 


The mathematical form of the discriminant analysis model is: 
\[ Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k \] 


where, Y = Dependent variable 


b = Coefficients of independent variables 
X = Predictor or independent variables 


It may be kept in mind that the dependent variable Y should be a 
categorized variable, whereas the independent variables X_i should be 
continuous. As the dependent variable is a categorized variable, it should be 
coded as 0, 1 or 1, 2, 3, similar to the dummy variable coding. 


The method of estimating the b coefficients is based on the principle that 
the ratio of the between-group sum of squares to the within-group sum of 
squares of the discriminant scores should be maximized. This makes the 
groups differ as much as possible on the values of the discriminant function. 
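

The criterion can be made concrete with a small sketch that computes this ratio for a given set of coefficients. The sketch only evaluates the ratio; it does not carry out the estimation itself. The group names, data values and candidate coefficients are hypothetical, and the constant term is left out because it shifts all scores equally and so does not change the ratio. 

```python
def mean(values):
    return sum(values) / len(values)


def discriminant_scores(rows, coefficients):
    """Score each observation as b1*X1 + b2*X2 + ... (no constant term)."""
    return [sum(b * x for b, x in zip(coefficients, row)) for row in rows]


def between_within_ratio(groups, coefficients):
    """Between-group sum of squares divided by within-group sum of squares."""
    scores = {
        name: discriminant_scores(rows, coefficients)
        for name, rows in groups.items()
    }
    all_scores = [s for group_scores in scores.values() for s in group_scores]
    grand_mean = mean(all_scores)
    between = sum(
        len(group_scores) * (mean(group_scores) - grand_mean) ** 2
        for group_scores in scores.values()
    )
    within = sum(
        (s - mean(group_scores)) ** 2
        for group_scores in scores.values()
        for s in group_scores
    )
    return between / within


# Hypothetical data: two predictors (X1, X2) observed for two groups.
groups = {
    "buyers": [[8, 2], [9, 3], [10, 2]],
    "non_buyers": [[3, 2], [4, 3], [2, 2]],
}

print(between_within_ratio(groups, (1, 0)))  # weight placed only on X1
print(between_within_ratio(groups, (0, 1)))  # weight placed only on X2
```

In this illustration, placing the weight on X1 gives a ratio of 13.5, while placing it on X2 gives a ratio of zero because the two groups have the same mean on X2. The estimation procedure searches for the coefficients that make this ratio as large as possible. 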


After having estimated the model, the b coefficients (also called 
discriminant coefficients) are used to calculate Y, the discriminant score, by 
substituting the values of X_i in the estimated discriminant model. For any 
new data point that we want to classify into one of the groups, a decision 
rule is formulated to determine the cut-off score, which is 
usually the midpoint of the mean discriminant scores of the two groups in 
case of two-group discriminant analysis, provided the sample sizes 
of the two groups are the same. The accuracy of classification is determined by 
using a classification matrix (also called confusion matrix). 
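

The scoring, cut-off and classification steps can be sketched as follows for the two-group case with equal sample sizes. The coefficients, the group labels and the observations are hypothetical values chosen for illustration; in practice the coefficients would come from the estimated model. 

```python
def discriminant_score(b0, coefficients, x):
    """Y = b0 + b1*X1 + b2*X2 + ... for one observation."""
    return b0 + sum(b * xi for b, xi in zip(coefficients, x))


def midpoint_cutoff(scores_1, scores_2):
    """Midpoint of the two group mean scores (assumes equal group sizes)."""
    mean_1 = sum(scores_1) / len(scores_1)
    mean_2 = sum(scores_2) / len(scores_2)
    return (mean_1 + mean_2) / 2


def classify(score, cutoff, low_group, high_group):
    """Assign the group whose mean score lies on the same side of the cut-off."""
    return high_group if score > cutoff else low_group


def classification_matrix(actual, predicted, labels):
    """Rows are actual groups, columns are predicted groups."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(matrix):
    """Share of observations lying on the diagonal of the matrix."""
    correct = sum(matrix[group][group] for group in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


# Hypothetical estimated model and data (two predictors, four cases per group).
b0, b = -6.0, (1.0, 0.5)
buyers = [(8, 2), (9, 3), (10, 2), (5, 1)]
non_buyers = [(3, 2), (4, 3), (2, 2), (7, 3)]

buyer_scores = [discriminant_score(b0, b, x) for x in buyers]
non_buyer_scores = [discriminant_score(b0, b, x) for x in non_buyers]
cutoff = midpoint_cutoff(buyer_scores, non_buyer_scores)

actual = ["buyer"] * len(buyers) + ["non_buyer"] * len(non_buyers)
predicted = [
    classify(score, cutoff, "non_buyer", "buyer")
    for score in buyer_scores + non_buyer_scores
]
matrix = classification_matrix(actual, predicted, ["buyer", "non_buyer"])

print(cutoff)
print(matrix)
print(accuracy(matrix))
```

With these values, the mean scores of the two groups are 3 and -0.75, so the cut-off is 1.125. Three of the four cases in each group fall on the correct side of the cut-off, giving a classification accuracy of 75 per cent. 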




The relative importance of the independent variables could be 
determined from the standardized discriminant function coefficient and the 
structure matrix. The difference between the standardized and unstandardized 
discriminant functions is that the unstandardized discriminant function 
has a constant term, whereas the standardized discriminant function 
has no constant term. 


13.4 THE PROCESS OF INTERPRETATION OF TEST 
RESULTS 


A lot of statistical information is available in today’s global and economic 
environment, which can be used successfully only if decision-makers are able 
to not only understand but also interpret the information, and use it effectively. 


The steps involved in statistical data analysis are: 

Step 1 – Defining the problem 


To obtain correct data it is essential to define the problem accurately. 


Step 2 – Collecting data 


The next step is to design ways for data collection. You could collect data 
from the entire population, that is a set of all elements of interest in a study 
or you could collect from a sample of the population, that is, a subset of the 
population. It can be collected through observational or experimental studies 
or from existing sources. The data could either be cross-sectional, that is 
collected at the same or approximately the same point in time or time series, 
that is, collected over several time periods. Data could be qualitative, that is, 
labels or names used to identify an attribute of each element. Quantitative data, 
on the other hand, could be numeric data indicative of numbers or volumes. 


Step 3 – Analysing the data 


The data collected can be analysed using exploratory methods or confirmatory 
methods. While exploratory techniques try to find out what the data is trying 
to say using simple maths or illustrations, confirmatory methods use ideas 
(from probability theory) to try to answer specific questions. 


Step 4 – Reporting the results 


The results of the analysis can be presented in the form of a graph, a table, 
a pie chart or a set of percentages, in case the sample is small. In case the 
analysis involves an entire population, the report will also have to be detailed. 


13.4.1 Guidelines for Making Valid Interpretation 


Validation is the process by which evidence is gathered to come up with 
a reliable and sound scientific basis to help interpret the results/scores as 
proposed by the test developer or user. Validation starts with a framework that 
defines the scope and aspects (in the case of multi-dimensional scales) of the 
proposed interpretation. The framework also includes a rational justification 
that links the interpretation to the test. 


The next step is to list a series of propositions to be fulfilled for the 
interpretation to be valid. The other option is to compile a list of issues that 
may adversely affect the validity of the interpretations. In either case evidence 
needs to be gathered through original or empirical research or through meta- 
analysis or review of existing literature, or logical analysis of the issues. This 
supports or questions the interpretation’s propositions or threats to its validity. 
The focus is on quality, rather than quantity, of the evidence. 


A single interpretation of any test may require many propositions to be 
true. Even strong evidence supporting a single proposition will not lessen 
the need to support the other propositions. 


Evidence to support (or question) the validity of an interpretation can 
be classified into one of the following categories based on: 


1. Test content 
2. Response processes 
3. Internal structure 
4. Relations to other variables 
5. Consequences of testing 


Methods to collect each type of evidence should only be used when 
they result in information that would either back or challenge the propositions 
needed for the interpretation. Each piece of evidence is then incorporated 
into a validity argument, which may require the test to be revised, or the 
administration protocol of the test to be modified, or the theories or 
concepts forming the base of the interpretations to be revised. If any such 
change is made, a new validation process will be required to collect 
evidence that supports the revised or new version. 


Qualitative Research and Validity 


In quantitative research, validity refers to the following: 


• Internal validity (dependent on the strength of the relation between 
cause and effect) 


• External validity (indicating the possibility to generalize findings). 


In qualitative research, validity is a rather complex issue and it is not 
possible to apply traditional standards easily. If the knowledge is subjective 
or if there is a single inaccessible truth, the validity criteria can only be very 
generic and/or subjective. Lincoln and Guba (1985) suggested empiricist 
criteria for qualitative research, as follows: 


• Credibility or internal validity 
• Transferability or external validity 
• Dependability 
• Confirmability 


Later the concepts of authenticity and morality were brought in (Angen 
2000). 


Check Your Progress 


1. List one assumption in the ANOVA technique. 
2. What is factor analysis? 

3. What is the first step of factor analysis? 

4. What is validation? 


13.5 ANSWERS TO CHECK YOUR PROGRESS 
QUESTIONS 


1. One of the assumptions used in the ANOVA technique is that all the 
involved populations from where the samples are taken are normally 
distributed. 


2. Factor analysis is a multivariate statistical technique in which there is 
no distinction between dependent and independent variables. 


3. The first and the foremost step in factor analysis is to decide on how 
many factors are to be extracted from the given set of data. 


4. Validation is the process by which evidence is gathered to come up 
with a reliable and sound scientific basis to help interpret the results/ 
scores as proposed by the test developer or user. 
13.6 SUMMARY 


There are situations where assumptions cannot be made. In such 
situations, different statistical methods are used, which are known as 
non-parametric tests. 


The Mann-Whitney test is used to examine whether two samples have 
been drawn from populations with the same location (mean). 


The Kruskal-Wallis test is an extension of the Mann-Whitney U 
test discussed earlier. Both methods require that the scale of the 
measurement of a sample value should be at least ordinal. 


Factor analysis is a multivariate statistical technique in which there is 
no distinction between dependent and independent variables. In factor 
analysis, all variables under investigation are analysed together to 
extract the underlying factors. 


Cluster analysis is a grouping technique. The basic assumption 
underlying the technique is the fact that similarity is based on multiple 
variables, and the technique attempts to measure the proximity in terms 
of the study variables. 


Discriminant analysis is used to predict group membership. This 
technique is used to classify individuals/objects into one of the 
alternative groups on the basis of a set of predictor variables. 


The dependent variable in discriminant analysis is categorical and on 
a nominal scale, whereas the independent or predictor variables are 
either interval or ratio scale in nature. 


Validation starts with a framework that defines the scope and aspects 
(in the case of multi-dimensional scales) of the proposed interpretation. 
The framework also includes a rational justification that links the 
interpretation to the test. 


13.7 KEY WORDS 


Factor Analysis: It is a process in which the values of observed data 
are expressed as functions of a number of possible causes in order to 
find which are the most important. 


Cluster Analysis: It is the task of grouping a set of objects in such a 
way that objects in the same group are more similar to each other than 
to those in other groups. 


Perceptual Mapping: It is a diagrammatic technique used by asset 
marketers that attempts to visually display the perceptions of customers 
or potential customers. 


13.8 SELF ASSESSMENT QUESTIONS AND 
EXERCISES 


Short-Answer Questions 


1. Discuss the Mann-Whitney test. 


2. List the testing procedure of the Wilcoxon matched-pair signed rank 
test. 


3. Write a short note on the uses of factor analysis. 


4. List the steps in statistical data analysis. 


Long-Answer Questions 


1. Illustrate the Kruskal-Wallis test. 
2. Examine the factor analysis exercise in detail. 
3. Describe the use of cluster analysis. 


4. Explain the discriminant analysis model in detail. 
UNIT 14 REPORT WRITING 


Structure 
14.5 Report Writing: Principles, Features and Criteria 
14.5.1 Principles of a Good Report Writing 
14.5.2 Features of a Good Research Report 
14.5.3 Criteria for Evaluating Research Reports/Research Findings 
14.6.2 References and Annotations 
14.7 Data Support and Diagrammatic Elucidation 
14.8 Answers to Check Your Progress Questions 
14.9 Summary 
14.10 Key Words 
14.11 Self Assessment Questions and Exercises 
14.12 Further Readings 


14.0 INTRODUCTION 


In this unit, you will learn about the various aspects of research report 
writing. On completion of the research study and after obtaining the research 
results, the real skill of the researcher lies in analysing and interpreting the 
findings and linking them with the propositions formulated in the form of 
research hypotheses at the beginning of the study. The statistical or qualitative 
summary of results would be little more than numbers or conclusions unless 
the researcher is able to present the documented version (research report) of 
the research endeavour. 


Thus, one cannot overemphasize the significance of a well-documented 
and structured research report. Just like all the other steps in the research 
process, this requires careful and sequential treatment. In this unit, we will 
be discussing in detail the documentation of the research study. The format 
and the steps might be moderately adjusted and altered based on the reader’s 
requirement. Thus, it might be for an academic and theoretical purpose or 
might need to be clearly spelt and linked with the business manager’s decision 
dilemma. 




14.1 OBJECTIVES 


After going through this unit, you will be able to: 


• Discuss the role and types of reports 
• Identify the steps involved in drafting reports 
• Describe the various contents of a research report 
• Explain the principles, criteria and features of a good research report 
• Discuss the methods to be followed in diagrammatic elucidation of 
data in research reports 
• Analyse how clarity and brevity of expression enhance the quality 
of research reports 


14.2 ROLE AND TYPES OF RESEARCH REPORTS 


Research reports are designed to convey and record information 
that will be of practical use to the reader. They are organized into distinct 
units of specific and highly visible information. The role of research reports 
may be summarized as follows: 


• The research report fulfils the historical task of serving as a concrete 
proof of the study that was undertaken. This serves the purpose of 
providing a framework for any work that can be conducted in the same 
or related areas. 


• It is the complete detailed report of the research study undertaken by 
the researcher, thus it needs to be presented in a comprehensive and 
objective manner. This is a one-way communication of the researcher’s 
study and analysis to the reader/manager, and thus needs to be 
all-inclusive and yet neutral in its reporting. 


• For academic purposes, the recorded document presents a knowledge 
base on the topic under study and for the business manager seeking help 
in taking more informed decisions, the report provides the necessary 
guidance for taking appropriate action. 


• As the report documents all the steps followed and the analysis carried 
out, it also serves to authenticate the quality of the work carried out 
and establishes the strength of the findings obtained. 


Thus, effective recording and communicating of the results of the 
study becomes an extremely critical step of the research process. Based on 
the nature of the research study and the researcher’s orientation, the report 
can take different forms. 
Types of Research Reports 


Research reports can be categorized on the following bases: 


1. On the basis of size 
2. On the basis of information 
3. On the basis of representation 


1. Classification on the basis of size 


Based on the size of the report, it is possible to divide the report into brief 
reports and detailed reports. 


• Brief reports: These kinds of reports are not formally structured and 
are generally short, sometimes not running more than four to five 
pages. The information provided has limited scope and is a prelude 
to the formal structured report that would subsequently follow. These 
reports could be designed in several ways. 


o Working papers or basic reports are written for the purpose of 
recording the process carried out in terms of scope and framework 
of the study, the methodology followed and instrument designed. 
The results and findings would also be recorded here. However, 
the interpretation of the findings and study background might be 
missing, as the focus is more on the present study rather than past 
literature. 


o Survey reports might or might not have an academic orientation. 
The focus here is to present findings in an easy-to-comprehend format 
that includes figures and tables. The advantage of these reports is 
that they are simple and easy to understand and present the findings 
in a clear and usable format. 


• Detailed reports: These are more formal and could be academic, 
technical or business reports. 


Sometimes, the researcher may prepare both kinds— for an individual 
as well as for a business purpose. 


2. Classification on the basis of information 


The ways in which the results of the research report can be presented 
on the basis of the information contained are as follows: 


• Technical report: A technical report is not written by the researcher 
himself but is written on behalf of other researchers. In writing technical 
reports, importance is mainly given to the methods that have been used 
to collect the information and the data, the presumptions that were 
made and finally, the various presentation techniques that were used 
to present the findings and the data. Following are the main features 
of a technical report: 


o Summary: It covers a brief analysis of the findings of the research 
in a few pages. 


o Nature: It contains the reasons for which the research is undertaken, 
the analysis and the data that is required in order to prepare the report. 


o Methods employed: It contains a description of the methods that 
were employed in order to collect data. 


o Data: It covers a brief analysis of the various sources from which 
the data was collected with their features and drawbacks. 


o Analysis of data and presentation of the findings: It contains the 
various forms in which the data that has been analysed can be 
presented. 


o Conclusions: It contains a brief explanation of the findings of the 
research. 


o Bibliography: It contains a detailed analysis of the various 
bibliographies that have been used in order to conduct the research. 


o Technical appendices: It contains the appendices for the technical 
matters and for questionnaires and mathematical derivations. 


o Index: The index of the technical report must be provided at the 
end of the report. 


• Popular report: A popular report is formulated when there is a need 
to draw conclusions from the findings of the research report. One of 
the main considerations that should be kept in mind while formulating 
a popular report is that it must be simple and attractive. It must be 
written in a very simple manner that can be understood by all, and also 
be made attractive by using large print, various sub-headings and 
occasional cartoons. The following are the main points that 
must be kept in mind while preparing a popular report: 


o Findings and their implications: While preparing a popular 
report, importance is given to the findings of the information and 
the conclusions that can be drawn out of these findings. 


o Recommendations for action: If there are any deviations in the 
report then recommendations are made for taking corrective action 
in order to rectify the errors. 


o Objective of the study: In a popular report, the specific objective 
for which the research has been undertaken is presented. 


o Methods employed: The report must contain the various methods 
that have been employed in order to conduct the research. 


o Results: The results of the research findings must be presented in 
a suitable and appropriate manner with the help of charts and 
diagrams. 


o Technical appendices: The report must contain, in the form of 
appendices, the in-depth information used to collect the data. 


• Technical reports: These are major documents and would include 
all elements of the basic report, as well as the interpretations and 
conclusions, as related to the obtained results. This would have a 
complete problem background and any additional past data/records 
that are essential for understanding and interpreting the study results. 
All sources of data, sampling plan, data collection instrument(s) and data 
analysis outputs would be formally and sequentially documented. 


• Business reports: These reports include conclusions as understood 
by the business manager. The tables, figures and numbers of the first 
report would now be pictorially shown as bar charts and graphs and 
the reporting tone would be more in business terms. Tabular data might 
be attached in the appendix. 


3. Classification on the basis of representation 


Following are the ways through which the results of the research report can 
be classified on the basis of representation: 


• Written report: A written report plays a vital role in every business 
operation. The manner in which an organization writes business letters 
and business reports creates an impression about its standard. Therefore, 
the organization should emphasize improving the writing skills 
of its employees in order to maintain effective relations with its 
customers. Making an effective written report requires a lot of hard 
work. Therefore, before you begin writing, it is important to know the 
objective, i.e., the purpose of writing, and to collect and organize the 
required data. 


• Oral report: At times, oral presentation of the results that are drawn out 
of research is considered effective, particularly in cases where policy 
recommendations are to be made. This approach proves beneficial 
because it provides a medium of interaction between the listeners and 
the speakers. This leads to a better understanding of the findings and 
their implications. However, the main drawback of oral presentation is 
the lack of any permanent record related to the research. Oral presentation 
of a report is more effective when it is supported by various visual 
devices such as slides, wall charts and white boards that help in better 
understanding of the research reports. 




14.3 STEPS INVOLVED IN DRAFTING RESEARCH 
REPORTS 


Whatever the type of report, the reporting and dissemination of the study 
and its findings require a structured format and by and large, the process is 
standardized. As stated above, the major difference amongst the types of 
reports is that all the elements that essentially constitute a research report 
would be present only in a detailed technical report. In the management 
report, the information on the sampling techniques follows the research 
intention, and the questionnaire design details need not be reported. The 
review of past literature would be perfunctory in the management report; 
however, it would be detailed and accompanied by the bibliography 
in the technical report. Usage of theoretical and technical jargon would be 
higher in the technical report and visual presentation of data would be higher 
in the management report. 


The process of report formulation and presentation is presented 
in Figure 14.1. As can be observed, the preliminary section includes the 
rudimentary parts, for example the title page, followed by the letter of 
authorization, acknowledgements, executive summary and the table of 
contents. Then comes the background section, which includes the problem 
statement, introduction, study background, scope and objectives of the study 
and the review of literature (depends on the purpose). This is followed by the 
methodology section, which, as stated earlier, is again specific to the technical 
report. This is followed by the findings section and then come the conclusions. 
The technical report would have a detailed bibliography at the end. 


In the management report, the sequencing of the report might be 
reversed to suit the needs of the decision-maker, as here the reader needs 
to review and absorb the findings. Thus, instead of simply summarizing the 
statistical results, the findings need to be presented in such a way that they can 
be used directly as inputs for decision-making. Thus, the last section would 
be presented immediately after the study objectives and a short reporting on 
methodology could be presented in the appendix. 
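The resequencing described above can be sketched in a few lines of code. This is a minimal illustration only: the section names are taken from Figure 14.1, while the function name `management_order` and the exact reordering rule (conclusions placed directly after the background section, methodology moved to the appendix) are assumptions made for the example, not a prescribed standard.

```python
# Section order of a detailed technical report, following Figure 14.1.
TECHNICAL_ORDER = [
    "Preliminary section",
    "Background section",
    "Methodology section",
    "Findings section",
    "Conclusions section",
]


def management_order(sections):
    """Return (main_body, appendix) for a management report.

    Assumed rule: the conclusions are moved up to follow the background
    section (which ends with the study objectives), and the methodology
    is reported only briefly in the appendix.
    """
    main_body = [s for s in sections if s != "Methodology section"]
    main_body.remove("Conclusions section")
    position = main_body.index("Background section") + 1
    main_body.insert(position, "Conclusions section")
    appendix = ["Methodology section (short)"]
    return main_body, appendix


body, appendix = management_order(TECHNICAL_ORDER)
for name in body + appendix:
    print(name)
```

The same study can therefore yield two differently ordered documents without any change to the underlying content.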


Thus, the entire research project needs to be recorded either as a single
written report or as several reports, depending on the needs of the readers.
The researcher would need to assist the business manager in deciphering the
report and executing the findings and, if needed, revise the report to suit
the specific actionable requirements of the manager.


Thus, research reports are the product of slow, painstaking, accurate 
inductive work. The usual steps involved in writing a research report are as 
follows: 
Step 1: Logical analysis of the subject-matter
Step 2: Preparation of the final outline
Step 3: Preparation of the rough draft
Step 4: Rewriting and polishing
Step 5: Preparation of the final bibliography
Step 6: Writing the final draft


14.4 CONTENTS OF RESEARCH REPORT 


As presented in Figure 14.1, most research reports include the following 
sections: 


Preliminary Section
• Title Page
• Letter of Authorization
• Executive Summary
• Acknowledgements
• Table of Contents

Background Section
• Problem Statement
• Study Introduction and Background
• Scope and Objectives of the Study
• Review of Literature

Methodology Section
• Research Design
• Sampling Design
• Data Collection
• Data Analysis

Findings Section
• Results
• Interpretation of Results

Conclusions Section
• Conclusion and Recommendations
• Limitations of the Study


Fig. 14.1 The Process of Report Formulation and Writing 
Preliminary section


This section mainly consists of identification information for the study 
conducted. It has the following individual elements: 


Title page: This includes classification data about:

• The target audience, or the intended reader of the report

• The report author(s), including their name, affiliation and address

• The title of the study, presented in a manner that clearly indicates the
study variables, the relationship or status of the variables studied and
the population to which the results apply. The title should be crisp and
indicative of the nature of the project, as illustrated in the following
examples.

  o Comparative analysis of BPO workers and schoolteachers with
  reference to their work-life balance

  o Segmentation analysis of luxury apartment buyers in the National
  Capital Region (NCR)

  o An assessment of behavioural factors impacting consumer financial
  investment decisions


Letter of transmittal: This is the letter that goes alongside the formalized 
copy of the final report. It broadly refers to the purpose behind the study. 
The tone in this note can be slightly informal and indicative of the rapport 
between the client-reader and the researcher. The letter broadly refers to
three issues. It indicates the terms of the study or its objectives; next, it goes on
to give a broad indication of the process carried out to conduct the study
and the implications of the findings. The conclusions generally are indicative
of the researcher’s interest/learning from the study and in some cases may 
be laying the foundation for future research opportunities. 


Letter of authorization: Sometimes the letter of authorization may be 
redundant as indications of the formal approval for conducting the study might 
be included in the letter of transmittal. The author of this letter is the business 
manager or corporate representative who formally gives the permission for 
executing the project. The tone of this letter, unlike the above document, is 
very precise and formal, leaving no room for speculation or interpretation. 


As explained, this letter is not critical to submission if reference
to it has been made in the transmittal letter. However, if it is to
be included in the report, it is advisable to reproduce an exact copy of
the original letter.


Table of contents: All reports should have a section that clearly indicates 
the division of the report based on the formal areas of the study as indicated 
in the research structure. The major divisions and subdivisions of the study, 
along with their starting page numbers, should be presented. The subheadings 
and the smaller sections of a topic need not be indicated here as then the
presentation of the content seems cluttered.


Once the major sections of the report are listed, the list of tables comes
next, followed by the list of figures and graphs, exhibits (if any) and finally
the list of appendices.


Executive summary: This is the last and the most critical element of the
preliminary section. The summary of the entire report, starting from the
scope and objectives of the study to the methodology employed and the
results obtained, has to be presented in a brief and concise manner. In case
the research requirement was to provide recommended changes based on the
findings, it is advisable to provide short pointers here. Interestingly, it has
been observed that in most instances business managers read only the
executive summary in complete detail and most often just glance through
the rest of the report. Thus, it becomes extremely critical to present a gestalt
view of the entire report in a suitably condensed form.


The executive summary essentially can be divided into four or five 
sections. It begins with the study background, scope and objectives of 
the study, followed by the execution, including the sample details and 
methodology of the study. Next come the findings and results obtained. The
fourth section covers the conclusions, which are more or less based on the
opinion of the researcher. Finally, as stated earlier, in case the study objectives
necessitate implications, the last section would include recommendations
and suggestions.


Acknowledgements: A small note acknowledging the contribution of 
the respondents, the corporates and the experts who provided inputs for 
accomplishing the study is to be included here. 


Though the executive summary comes before the main body of the 
report, it is always prepared after the entire report has been finalized and is 
ready in its final form. The length of this section is one or two pages only and 
the researcher needs to effectively present the most significant parts of the 
study in a succinct form. It has been observed that the executive summary is 
a standalone document that is often circulated independently to the interested 
managers who might be directly or indirectly related to the study. 


Main report 
This is the most significant and academically robust part of the report. The 
sections of this division follow the essential pattern of a typical research study. 


Problem definition: This section begins with the formal definition of the 
research problem. The problem statement is the research intention and is more 
or less similar to what was stated earlier as the title of the research study. 


Study background: Study background presents details of the preliminary 
conceptualization of the management decision problem and all the groundwork 
done in terms of secondary data analysis, industry experts’ perspectives and 
any other earlier reporting of similar approaches undertaken. Thus, essentially, 
the section begins by presenting the decision-makers’ problem and then moves 
on to a description of the theoretical and contemporary market data that laid 
the foundation that guided the research. 


In case the study is academic research, there is a separate section
devoted to the review of related literature, which presents a detailed reporting 
of work done on the same or related topic of interest. 


Study scope and objectives: The logical arguments then conclude in the form 
of definite statements related to the purpose of the study. A clear definition 
of the scope and objective of the study is presented usually after the study 
background; in case the study is causal in nature, the formulated hypotheses 
are presented here as well. 


Methodology of research: This section would not be sequentially placed 
here, for short reports or for a business report. In such reports, a short 
description of the methodology followed would be documented in the 
appendix. However, for a technical and academic report, this is a significant 
and primary contribution of the research study. The section would essentially 
have five to six sections specifying the details of how the research was 
conducted. These would essentially be: 


• Research framework or design: The variables and concepts being
investigated are clearly defined, with a clear reference to the
relationship being studied. The justification for using a particular design
has to be presented in a sequential and step-wise manner, listing the
experimental and control conditions in the case of a causal study. The
researcher must take care to keep the technical details of the execution
in the appendix and present the execution details in simple language
in the main body.

• Sampling design: The entire sampling plan in terms of the population
being studied, along with the reasons for collecting the study-related
information from the given group, is given here. The execution details,
in terms of sample size calculations, sampling frame considered and
field work details, can be recorded in the appendix rather than in the
main body of the report. However, the sample profile and identification
details are included in the main section. As stated earlier, the report
needs to be reader-friendly, and too much technical information might
not be required by the decision-maker.

• Data collection methods: In this section, the researcher should clearly
list the information needed for the study as drawn from the study
objectives stated earlier. The secondary data sources considered and the
primary instrument designed for the specific study are discussed here.
However, the final draft of the measuring instrument can be included
in the appendix, which includes the execution details in terms of how
the information was collected, how the open-ended or opinion-based
questions were handled, and how irregularities were handled and
accounted for in the study. This and similar information gives a
clear insight into the standardization of procedures maintained.

• Data analysis: Here, the researcher again needs to revisit the research
objectives and the study design in order to justify the analytical tools
and techniques used in the study. The assumptions and constraints
of the analysis need to be explained here in simple, non-technical
terms. There is no need to give a detailed description of the statistical
calculations here.


Study results and findings: This is the most critical chapter of the report 
and requires special care; it is probably also one of the longest chapters 
in the document. The researcher could, thus, consider either breaking 
this into subchapters or at least clear subheadings. 


Researchers commonly divide the chapter on the basis of the data 
collection plan, i.e., there is a section on interview analysis, another one on 
focus group discussion and the third referring to the questionnaire analysis. 
This, however, does not serve any purpose as the results would then seem 
repetitive and disjointed. Instead, the result should be organized according to 
the information areas on which the data was collected or on the basis of the 
research objectives. There are also times when the data would be presented for 
the whole sample and then will be split and presented for the sub-population 
studied. For example, in the study on work-life balance, the findings were 
presented for the whole sample and then at the micro level for the BPO 
sector and separately for the school teacher segment. For each group, first the 
sample profile in terms of the demographic details of age, education, income 
(individual and family), years of experience, marital status, family size and 
other details was presented. Next, the descriptive data was made available
on the seven sub-scales studied. Lastly, the predictive data, based on a
multiple regression analysis with work-life balance as the dependent variable
and the seven variables as independent, was presented. There was only one
open-ended question related to the individual’s suggestion as to what support 
was required from one’s place of work to achieve work-life balance. This was 
presented last in the form of a bar chart showing variability in the responses 
given. Again as advised earlier, it is essential to present the findings in the 
form of simplified tables, graphs and figures, with the same being explained 
in simple text subsequently. 
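The predictive step in the work-life balance example can be written out as a regression equation. The following is a minimal sketch, assuming an ordinary linear multiple regression; the symbols are introduced here for illustration only, and the seven sub-scales are not named in the text, so they appear simply as X_1 to X_7.

```latex
% WLB_i : work-life balance score of respondent i (dependent variable)
% X_{ki}: score of respondent i on sub-scale k, k = 1, ..., 7 (independent variables)
% \beta_0, \beta_k: intercept and regression coefficients; \varepsilon_i: error term
\[
  \mathrm{WLB}_i \;=\; \beta_0 \;+\; \sum_{k=1}^{7} \beta_k X_{ki} \;+\; \varepsilon_i
\]
```

In a report, each estimated coefficient would normally be shown in a simplified table together with its significance level, with the interpretation given in plain text beneath it.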


Interpretations of results and suggested recommendations 


The section study results and findings, i.e., the main report, presents a bird’s 
eye view of the information as it exists in a summarized and numerical form. 
This kind of information might become difficult to understand and convert 
into actionable steps, thus the real skill of the researcher lies in simplifying 
the data in a reader-friendly language. Here, it is recommended that this 
section should be more analytical and opinion based. The results could 
be supported by the data that was presented earlier, for example, industry 
forecasts or the expert opinion. In case the report had an earlier section on 
literature review, the researcher could demonstrate the similarity of findings 
with past studies done on the topic. For example, in a study conducted on 
analysing the antecedents of turnover intention, the results obtained were 
explained as follows: 

The results of the logit regression indicate that organizational
commitment, age and marital status are significant at 5 per cent and 10
per cent levels respectively. The results indicate that as organizational
commitment increases, the log of the odds ratio in favour of high
turnover intention reduces, which is very logical. This is in accordance
with the results obtained by Mobley, et al. (1978), Cotton and Tuttle
(1986), Igbaria and Greenhaus (1992), Ahuja, et al. (2007). Thus, when
employees feel committed to an organization, they are more likely to
stay with the organization.
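The interpretation quoted above can be read against the general form of a binary logit model. The equation below is a sketch under stated assumptions: the symbols are introduced here for illustration, only the three predictors named in the passage are shown (the original model may have included others), and no coefficient values are given because none are reported in the text.

```latex
% p_i  : probability that respondent i has high turnover intention
% OC_i : organizational commitment; Age_i : age; MS_i : marital status
\[
  \ln\!\left(\frac{p_i}{1 - p_i}\right)
  \;=\; \beta_0 + \beta_1\,\mathrm{OC}_i + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{MS}_i
\]
% A negative \beta_1 means the log of the odds of high turnover intention
% falls as commitment rises: a one-unit increase in OC_i multiplies the
% odds p_i / (1 - p_i) by e^{\beta_1}, which is below 1 when \beta_1 < 0.
```

Stating the direction of each coefficient in words, as the quoted passage does, lets a non-technical reader follow the result without the equation.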


Sometimes, the research results obtained may not be in the direction 
as found by earlier researchers. Here, the skill of the researcher in justifying 
the obtained direction is based on his/her individual opinion and expertise 
in the area of study. For example, in the same study on turnover intentions, 
contrary findings were explained as follows. 

... the results indicate that the log of the odds ratio in favour of high
turnover intention is more in the case of older respondents; this is
contrary to the findings of Zeffane and Gul (1995) and Finegold, 
et al. (2002). However, this has to be understood in the light of the 
profession, as in India, most people take the BPO sector as a stop-gap 
career and use the time at the BPO employment as an opportunity to 
enhance their academic qualification and then move on, which is also 
one of the reasons why this sector is a young sector. 


Subsequent to the subsection on the interpretation of results, sometimes,
the study requirement might be to formulate indicative recommendations to the 
decision-makers as well. Thus, in case the report includes recommendations, 
they should be realistic, workable and topically related to the industry studied. 
For example, to the business manager of organic food products, the following 
recommendation was made to build awareness amongst potential customers 
about the benefits of organic products: 
Organic food study: An illustration: The power of the print media in
promoting a high-involvement product is unsurpassed. Thus, articles by
leading nutritionists and doctors (88 per cent of consumers are influenced by
others in consuming health alternatives) on any aspect of organic food would
work well. The organic players need to take care that they do not advertise
their product offerings and price alone; they also need to educate
the consumer on the health benefits of the products in their advertisements.


The article/advertisement could be placed in the Sunday supplements
of newspapers so that people would read them at leisure. The major
decision-makers for groceries are women; thus, magazines like Femina, Health and
Savvy would be likely choices (the magazines suggested are English
fortnightlies and have a reader profile similar to our sample profile). This is
also because the product is a premium and niche product and thus requires
selective exposure.


Limitations of the study 


The last in this section is a brief discussion of the problems encountered 
during the study and the constraints in terms of time, financial or human 
resources. There could also have been constraints in obtaining the required 
information, either because the data about the topic of interest has not been 
collected or because it is not readily available to all. These clear revelations 
about the drawbacks are thus kept in mind by the reader when analysing the 
results and the implications of the study. 


End notes 


The final section of the report provides all the supportive material in the 
study. Some of the common details presented in this section are as follows: 


Appendices: The appendix section follows the main body of the report and 
essentially consists of two kinds of information: 


1. Secondary information, such as long articles, long tables, legal or
policy documents, or technical information that the study uses, is based
on or refers to and that the reader needs to understand.


2. Primary data that can be compressed and presented in the main body
of the report. This includes the original questionnaire, discussion guides,
formulae used for the study, sample details, original data, and long tables
and graphs which can be described in statement form in the text.


Bibliography: This is an important part of the final section as it provides the 
complete details of the information sources and papers cited in a standardized 
format. It is recommended to follow the publication manuals from the 
American Psychological Association (APA) or the Harvard method of 
citation for preparing this section. In fact, with advances in computer
technology, word processors such as Microsoft Office Word 2007 can
automatically generate a bibliography in a standard citation style, based on
the source information provided in the document.


The reporting content of the bibliography could also be in terms of: 


• Selected bibliography: Selective references are cited in terms of
relevance and reader requirement. Thus, the books or journals that
are technical and not really needed to understand the study outcomes
are not reported.

• Complete bibliography: All the items that have been referred to,
even when not cited in the text, are given here.

• Annotated bibliography: Along with the complete details of the
cited work, some brief information about the nature of information
sought from the article is given. This could run into three or four
lines or a brief paragraph.


At this juncture we would like to refer to another method of citation 
that an author might wish to use during report writing. This could be in the 
form of a footnote. To explain the difference we would first like to explain 
what a typical footnote is: 


Footnote: A typical footnote, as the name indicates, is part of the main report 
and comes at the bottom of a page or at the end of the main text. This could 
refer to a source that the author has referred to or it may be an explanation 
of a particular concept referred to in the text. 


The referencing protocol of a footnote and bibliography is different. In 
a footnote, one gives the first name of the person first and the surname next. 
However, this order is reversed in the bibliography. Here we start first with 
the surname and then the first name. In a bibliography, we generally mention 
the page numbers of the article or the total pages in the book. However, in a 
footnote, the specific page from which the information is cited is mentioned. 
A bibliography is generally arranged alphabetically depending on the author’s 
name, but in the footnote the reporting is based on the sequence in which 
they occur in the text. 
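The differences between the two referencing protocols can be illustrated with a short sketch. Everything in it is hypothetical: the `Source` record, the function names and the punctuation are assumptions chosen for the example and do not reproduce the APA or Harvard styles exactly, and the sample entries are placeholders rather than real publications.

```python
from dataclasses import dataclass


@dataclass
class Source:
    first_name: str
    surname: str
    title: str
    year: int
    pages: str  # page range of the article, or total pages of the book


def footnote_entry(src, cited_page):
    """Footnote: first name first, then surname; cites the specific page."""
    return f"{src.first_name} {src.surname}, {src.title} ({src.year}), p. {cited_page}."


def bibliography_entry(src):
    """Bibliography: surname first; gives the article's pages or total pages."""
    return f"{src.surname}, {src.first_name}. ({src.year}). {src.title}, pp. {src.pages}."


def footnotes(citations):
    """Number footnotes in the order in which the citations occur in the text."""
    return [f"{n}. {footnote_entry(s, page)}"
            for n, (s, page) in enumerate(citations, start=1)]


def bibliography(sources):
    """Arrange bibliography entries alphabetically by author name."""
    ordered = sorted(sources, key=lambda s: (s.surname.lower(), s.first_name.lower()))
    return [bibliography_entry(s) for s in ordered]


# Placeholder sources, not real publications.
article = Source("Jane", "Doe", "Sample Article", 2001, "10-25")
book = Source("Amit", "Basu", "Sample Book", 1999, "320")

print("\n".join(footnotes([(article, 12), (book, 45)])))
print("\n".join(bibliography([article, book])))
```

The footnote list keeps the order of citation (Doe, then Basu), whereas the bibliography reorders the same sources alphabetically (Basu, then Doe).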


Glossary of terms: In case there are specific terms and technical jargon 
used in the report, the researcher should consider putting a glossary in the 
form of a word list of terms used in the study. This section is usually the last 
section of the report. 


Check Your Progress 


1. What are the various bases of categorizing research reports?
2. When is a popular report formulated?
3. List the usual steps involved in writing a research report.
4. What do you understand by the letter of transmittal?
14.5 REPORT WRITING: PRINCIPLES, FEATURES
AND CRITERIA 


An important point to remember in report writing is that the document 
compiled is meant for specific readers. Thus, one needs to design the same 
according to the needs of the reader. Listed below are some features of a 
good research study that should be kept in mind while documenting and 
preparing the report. 


Clear report mandate: While writing the research problem statement and 
study background, the writer needs to be focused, precise and very explicit 
in terms of the problem under study, the background that provided the 
impetus to conduct the research and the study domain. This is prepared on the 
assumption that the writer at no point in time needs to be physically present 
in order to clarify the research mandate. One cannot make an assumption that 
the reader has earlier insights into the problem situation. The writer needs 
to be absolutely clear on the need for lucidity of thought and dissemination 
of this knowledge to the reader. 


Clearly designed methodology: Any research study has its unique orientation 
and scope and thus has a specific and customized research design, sampling 
and data collection plan. The writer, thus, needs to be explicit in terms of 
the logical justification for having used the study methods and techniques. 
However, as stated earlier, the language should be non-technical and reader 
friendly and any technical explanations or details must be provided in the 
appendix. In research that is not completely transparent about the set of
procedures followed, one cannot be absolutely confident of the findings and
resulting conclusions.


Clear representation of findings: The sample size for each analysis, any 
special conditions or data treatment must be clearly mentioned either as a 
footnote or as an endnote, so that the reader takes this into account while 
interpreting and understanding the study results. The sample base is very
important in justifying a trend or taking a strategic decision. For example, if
we report that 100 per cent of young bachelors want to buy groceries online
or on the telephone and recommend this as the delivery channel, we might
be making an error if the bachelors numbered only four out of a total sample
of 100 grocery buyers. Thus, complete honesty and transparency in stating
the treatment and editing of missing or contrary data is extremely critical.
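The bachelor example can be turned into a small reporting helper that never states a percentage without its base. This is a sketch only: the function name `describe_share` and the `min_base` cut-off of 30 are assumptions made for illustration (a commonly used rule of thumb, not a figure taken from the text).

```python
def describe_share(count, subgroup_n, total_n, min_base=30):
    """Report a percentage together with the sample base it rests on.

    count      -- respondents in the subgroup giving the answer
    subgroup_n -- size of the subgroup (the base of the percentage)
    total_n    -- size of the whole sample
    min_base   -- assumed threshold below which the base is flagged as small
    """
    if subgroup_n <= 0 or total_n <= 0:
        raise ValueError("sample sizes must be positive")
    pct = 100.0 * count / subgroup_n
    note = f"{pct:.0f}% (n = {subgroup_n} of {total_n} respondents)"
    if subgroup_n < min_base:
        note += f"; base below {min_base}, treat as indicative only"
    return note


# The case from the text: all four bachelors in a sample of 100 grocery buyers.
print(describe_share(4, 4, 100))
```

For the case in the text, the helper prints "100% (n = 4 of 100 respondents); base below 30, treat as indicative only", which makes the weakness of the claim visible to the reader at once.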


Representativeness of study finding: A good research report is also explicit 
in terms of extent and scope of the results obtained, and in terms of the 
applicability of findings. This is also dependent on whether the assumptions 
and preconditions made for formulating the conclusions and recommendations 
of the study have been explicitly stated. 


In order to ensure that one has been able to achieve the above stated
objective, the writer must ensure a standardization of procedures in writing
the document as well as follow standard protocols for preparing graphs and
tables. In the following section we will briefly discuss some simple rules that
the researcher can use as guidelines for this.


Guidelines for Effective Documentation 


The guidelines for an effective documentation may be discussed under the 
following heads: 


Command over the medium: Even though one may have done an extremely 
rigorous and significant research study, the fundamental test still remains as 
to how the learning has been disseminated. Regardless of how effective the
graphs and figures are in showcasing the findings, the verbal description and
explanation, in terms of why it was done, how it was done and what the
outcome was, still remains the acid test.


Thus, a correct and effective language of communication is critical in 
putting ideas and objectives in the vernacular of the reader/decision-maker. 
The writer may, thus, be advised to read professionally written reports and, if 
necessary, seek assistance from those proficient in preparing business reports. 


Phrasing protocol: There is a debate about whether or not one should make use
of personal pronouns while reporting. To understand this, one needs to revisit
the responsibility of the researcher, which is to present the findings of his/her
study with complete objectivity and precision. The use of personal
pronouns, as in ‘I think.....’ or ‘in my opinion.....’, lends a subjectivity
and personalization of judgement. Thus, the tone of the reporting should be
neutral. For example:


‘Given the nature of the forecasted growth and the opinion of the
respondents, it is likely that the......’


Whenever the writer is reproducing the verbatim information from 
another document or comment of an expert or published source, it must 
be in inverted commas or italics and the author or source should be duly 
acknowledged. For example: 

Sarah Churchman, Head of Diversity, PricewaterhouseCoopers, states
‘At PricewaterhouseCoopers we firmly believe that promoting work-life
balance is a “business-critical” issue and not simply the “right
thing to do”. Profitable growth and sustainable business depends on
attracting and retaining top talent and we know, from our own research
and experience that work-life policies are an essential ingredient of
successful recruitment and retention strategies.’


The writer should avoid long sentences and break up the information
into clear chunks, so that the reader can process it with ease. The same applies
to the structuring of the chapters or sections of the report, which can be logically
broken down into smaller sections that are comprehensive and complete and
yet maintain a strong, logical link with the flow of reporting.


With the spread of abbreviated communication in SMS and
emails, most people tend to use shortened forms such as ‘cd.’ for could and ‘u’ for
you. Such forms, as well as colloquial language and slang, must be avoided, as
this is a formal document and one must maintain the sanctity of the formal
documentation required in a research report.


Simplicity of approach: Along with grammatically and structurally correct
language, care must be taken to avoid technical jargon as far as possible. The
business manager might have been a business student who prepared a
research report in his academic pursuits, but now understands simple common
terms and does not have the time or inclination to juggle the dictionary and
the report together. In case it is imperative to use certain terminology, then,
as stated earlier, the definition of these terms can be provided in the glossary
of terms at the end of the report.


Sometimes the writer may prepare different research reports for the
same study to suit the needs of diverse readers. For example, the business report
needs to be crisp and simple with definable and workable recommendations.
On the other hand, an academic report could discuss extensively the literature
review section, as well as the statistical analysis and interpretation.


Report formatting and presentation: In terms of paper quality, page margins
and font style and size, a professional standard should be maintained. The
font style must be uniform throughout the report. The topics, subtopics,
headings and subheadings must be formatted in the same manner throughout
the report. Sometimes certain academic reports have a mandated format
for presentation which the writers need to follow, in which case there is no
choice in presentation.


However, when this is not clear, it is advisable that the writer creates
his/her own formatting rules and saves them on a notepad so that they can be
implemented in a standardized and professional manner.


The researcher can provide data relief and variation by adequately 
supplementing the text with graphs and figures. Pictorial representations are 
simple to comprehend and also break the monotony and fatigue of reading. 
They should be used effectively whenever possible in the report. 


14.5.1 Principles of Good Report Writing


Based on the above description, it may be concluded that report writing 
should be based on the following principles: 


• Principle of purpose: A report must have a clear and meaningful
purpose that can be converted into effective management action. A clear
statement of purpose helps prepare a well-focussed report on which the
management can work. Specification of purpose is important because:

  o Reports are analyses of facts and proposals.
  o They are records of particular business activities.

• Principle of organization: A written report should be well-designed
and well-ordered. The managerial plan of a report must include the
following:

  o Purpose of the report
  o Information required to be included in the report
  o Method used to collect report data
  o Summary of the report
  o Problems and solutions of the subject mentioned in the report
  o An appendix that describes and confirms the content and conclusion
  of the report

• Principle of brevity: Reports should be concise. This is essential because
long reports:

  o Are costly.
  o Are difficult to examine.
  o Are prone to disapproval, as they seem insufficient.
  o Focus on irrelevant minor details, which may cause major points to
  be overlooked.

• Principle of clarity: Reports should be clear. Clarity can be maintained
by using simple language for writing the report. New terms, if any in
the report, should be properly explained to avoid confusion.

• Principle of scheduling: Reports should be prepared at a time when
there is no undue burden on the staff or when the staff has sufficient
time to prepare reports. However, the time period between the gathering
of data and generating finished reports should not be long; otherwise, the
report may become outdated and useless.

• Principle of cost: While preparing reports, it is necessary that a
cost-benefit analysis be done. A report should have minimum
cost and maximum benefit. If the cost of preparation of the report
is high but its benefit is low, then it is not advisable to prepare that report.


14.5.2 Features of a Good Research Report 


Research reports must be efficient and well formatted, and the matter
should be clear, analytical and directive. The actual facts need to be
explained clearly. Data and results should be presented in graphical or
tabular format, as this creates a good impression and is easy to
understand.

The characteristic features of a good research report may be listed as
follows:

• Information collected in the report should be relevant and focused
on deriving the desired results.

• The report should strictly adhere to predefined goals and objectives.

• The report should describe the questionnaires used in the analysis
and the means adopted in their preparation.

• The report should elaborate the methodology used in the interviews.

• There must be an executive summary of the work in the report.

• The report should present not only the actual analysis but also the
reasons for preparing the report. It should also highlight the advantages
and profit that successful implementation of the business plans
described in the report can provide.

• It should also describe the research methodology, presenting the
overall process adopted to create the report.

• The report needs to be flexible enough to be changed according to
requirements.


14.5.3 Criteria for Evaluating Research Reports/Research Findings 


Research reports/findings are evaluated on the basis of the following
criteria:

• Clarity: The report should be clear in terms of representation of data.
It should be easy to understand.

• Statement of objective: The objective of the report should be stated
at the beginning of the report. While evaluating, it is important to check
whether the research achieved the stated objective.

• Relevance of data: The data in the report should be relevant to the
research topic. In addition, it is important to check whether recent
data were used in the report.

• Analysis of data: The data should be properly analyzed. Thus, the
evaluator checks whether all the findings are supported by analysis.

• Unbiased: The report should not be biased towards a particular
interpretation, because bias affects the complete process of research.


14.6 RESEARCH REPORT: LANGUAGE FLOW AND 
GRAMMATICAL QUALITY 


A report must have a clear and logical structure with a clear indication of
where the ideas are leading. It should be able to make a good first
impression. The presentation of the report is very important. All reports
must be written in good language, using short sentences and correct grammar
and spelling. The main points to be kept in mind in this light are as follows:


• Context and style:
o Appropriate, informative title for the report
o Crisp, specific, unbiased writing with minimal jargon
o Adequate analysis of prior relevant research

• Questions/hypotheses:
o Clearly stated questions or hypotheses
o Thorough operational definitions of key concepts along with the
exact wording or measurement of the key variables

• Research procedures:
o Full and clear description of the research design
o Demographic profile of the participants/subjects
o Specific data gathering procedures

• Data analysis:
o Appropriate inferential statistics for sample or experimental data
and appropriate use of descriptive statistics
o Clear and reasonable interpretation of the statistical findings,
accompanied by effective tables and figures

• Summary:
o Fair assessment of the implications and limitations of the findings
o Effective commentary on the overall implications of the findings
for theory and/or policy


14.6.1 Clarity and Brevity of Expressions 


There is a famous saying that ‘words are like a mirror that reflects the
personality of the person from whose mouth they come out’. Thus, if a
researcher wants to hold the attention of the reader, the report must be
clear in its expression, as this also says a lot about the clarity of the
researcher’s thought. Experts emphasize the importance of using as few
words as possible to deliver a message. However, messages that are very
brief sometimes sacrifice clarity and leave out vital information. Thus,
while crafting the report, the researcher should choose clarity over brevity,
include all relevant information and make sure it is logically organized.


Three rules for bringing clarity in reports 


In business writing, you get points for clarity, not style. Instead of trying to
wax poetic about your division’s plans for the next 60 days, just make your
point. Here are three ways to do that:

(i) Limit one idea to one paragraph: The researcher should limit his
thoughts to one per paragraph. When he has another suggestion, thought
or idea, he should start a new paragraph.

(ii) Make it scannable: The report should be prepared so that the audience
is able to quickly scan the researcher’s message and understand his point.

(iii) Put your point in the first sentence: The researcher should not entice
his readers with background information and build-up. He should make
his primary point first. Then he should go into supporting detail.


14.6.2 References and Annotations 


Several report types, such as scientific, engineering, technical and census
reports, contain either original writing or text adopted from a previous
work. A report writer should therefore be careful to avoid plagiarism and
any violation of copyright laws. The necessary rules of thumb in this regard
can be stated as follows:


• Citations and referencing:

o A citation is the acknowledgement in your writing of the work of
other authors; it covers both paraphrasing and direct quotes.

o Unless a direct quote is really necessary, you should write the
material in your own words. This shows that you understand what you
have read and know how to apply it to your own context.

o Direct quotes should be used sparingly.

• Direct quotes:

o Short direct quotes: These need to be placed between quotation
marks. For example, Rosenfield defines a cluster as a ‘geographically
bounded concentration of similar, related or complementary
businesses, with active channels for business transactions,
communications and dialogue that share specialized infrastructure,
common opportunities and threats.’ This shows clearly that the
words being used are not your own words.

o Long direct quotes: There are occasions when it is useful to include
long direct quotes. If you are quoting more than forty words, you
should again use quotation marks but also indent the text. For
example:


The sustainability of higher value-added industry is grounded in the 
diminishing significance of cost structures. At the level of the European 
Union, a weak capacity to innovate has been identified as an innovation, in 
the sense of product, process and organizational innovation, accounts for a 
very large amount, perhaps 80-90 per cent of the growth in productivity in 
advanced economies. 
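
The forty-word rule for direct quotes can be expressed as a small formatting helper. The following is only a minimal sketch in Python, not a prescribed tool: the function name format_quote, the four-space indent and the 70-character line width are illustrative assumptions; only the forty-word threshold and the use of quotation marks come from the rule stated above.

```python
import textwrap

LONG_QUOTE_WORDS = 40  # threshold taken from the rule stated above


def format_quote(quote, width=70, indent="    "):
    """Put a direct quote in quotation marks; indent it if it exceeds forty words."""
    quoted = "'" + quote.strip() + "'"
    if len(quote.split()) <= LONG_QUOTE_WORDS:
        return quoted  # short quote: stays inline, in quotation marks
    # long quote: quotation marks plus an indented block
    # (the indent and line width are assumptions made for this sketch)
    return textwrap.fill(quoted, width=width,
                         initial_indent=indent, subsequent_indent=indent)


short = format_quote("Direct quotes should be used sparingly.")
long_block = format_quote(" ".join(["word"] * 45))  # hypothetical 45-word quote
```

A quote of exactly forty words is treated as short here, since the rule speaks of quoting more than forty words.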


14.7 DATA SUPPORT AND DIAGRAMMATIC 
ELUCIDATION 


The visual representation of findings in the form of lines, boxes or bars
relative to a number line is easy to comprehend and interpret. There are
some standard rules and procedures available to the researcher for this;
there are also computer programs, such as MS Excel and SPSS, with which
numerical data can easily be converted into graphical form.


Line and curve graphs: Usually, when the objective is to demonstrate trends 
and some sort of pattern in the data, a line chart is the best option available 
to the researcher as the line is able to clearly portray any change in pattern 
during a particular time period. On the same chart, it is also possible to show 
patterns of growth of different sectors or industries in the same time period or 
to compare the change in the studied variable across different organizations 
or brands in the same industry. Certain points to be kept in mind while 
formulating line charts include: 


• The time units or the causal variable being studied are to be put on the
X-axis, or the horizontal axis.

• If the intention is to compare different series on the same chart, the
lines should be of different colours or forms (Figure 14.2).

• Too many lines on the same chart are not advisable, as the data then
becomes cluttered; an ideal number would be five lines or fewer.

Fig. 14.2 Comparative Analysis of Vehicles (including Nano) on Features
Desired by Consumers

Source: vytrak.com

• The researcher must also take care to include the zero baseline in
the chart, as otherwise the data can appear misleading. For example,
in Figure 14.3(a), where the zero baseline is shown, the expected change
in the number of hearing aid units to be sold over the period 2002-03
to 2007-08 can be perceived accurately. However, in Figure 14.3(b),
where the axis starts at 1,50,000 units, the rate of growth can be
misjudged to be swifter than it is.

Fig. 14.3(a) Expected Growth in the Number of Hearing Aids Units to be sold in North
India (Three Perspectives)

Fig. 14.3(b) Expected Growth in the Number of Hearing Aids Units to be sold in North
India (Three Perspectives)
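
The effect of dropping the zero baseline can be checked with simple arithmetic. The sketch below, in Python, compares how tall the last plotted point looks relative to the first when the vertical axis starts at zero and when it starts at 1,50,000 units. The sales figures of 1,60,000 and 2,00,000 units and the function name apparent_growth are hypothetical, chosen only for illustration; they are not the data behind Figure 14.3.

```python
def apparent_growth(first, last, axis_start=0.0):
    """Ratio of plotted heights (last versus first) when the axis begins at axis_start."""
    if first <= axis_start:
        raise ValueError("the first value must lie above the start of the axis")
    return (last - axis_start) / (first - axis_start)


# Hypothetical unit sales in the first and last year of the series
first_year, last_year = 160000, 200000

with_zero_baseline = apparent_growth(first_year, last_year)          # axis starts at 0
with_cut_baseline = apparent_growth(first_year, last_year, 150000)   # axis starts at 1,50,000
```

With the zero baseline the last point is drawn 1.25 times as high as the first, which matches the actual growth of 25 per cent. With the axis cut at 1,50,000 units, the same data is drawn 5 times as high, which is why the growth can be misjudged.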
Area or stratum charts: Area charts are like line charts, usually used to
demonstrate changes in a pattern over a period of time. However, here there
are multiple lines that are essentially components of the original composite
data. The change in each of the components is shown individually on the
same chart, and the components are stacked one on top of the other.
The areas between the various lines indicate the scale or volume of the
relevant factors/categories (Figure 14.4).

Fig. 14.4 Perception of Nano by Three Psychographic Segments of Two-wheeler Owners


Pie charts: Another way of showing an area, stratum or sectional
representation is the pie chart. The critical difference between a line
chart and a pie chart is that the pie chart cannot show changes over time.
It simply shows the cross-section of a single time period. The sections or
slices of the pie indicate the ratio of that section to the total area of the
parameter being displayed. There are certain rules that the researcher
should keep in mind while creating pie charts.

• The complete data must be shown as a 100 per cent area of the subject
being graphed.

• It is a good idea to display the percentages within or above the pie
rather than in the legend, as it is then easier to understand the
magnitude of each section in comparison to the total. For example,
Figure 14.5 shows the brand-wise sales in units for the existing brands
of hearing aids in the North Indian market.

Fig. 14.5 Brandwise Sales (units) of Hearing aids in the North India Market (2002-03)

• Showing changes over time is difficult through a pie chart, as stated
earlier. However, the change in the components at different time periods
could be demonstrated as in Figure 14.6, which shows the shares of the
car market in India in 2009 and the expected market composition in 2015.

Fig. 14.6 Current Structure of the Indian car Market (2009) and the Forecasted
Structure for 2015
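
The first two rules for pie charts amount to converting each category into its share of a 100 per cent total and printing that share on the slice. A minimal sketch in Python is given below. The brand names, the sales figures and the function name pie_shares are hypothetical and are not the data shown in Figure 14.5.

```python
def pie_shares(values):
    """Return each category's percentage share of the total (the shares add up to 100)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("the total must be positive to form a pie chart")
    return {name: 100.0 * value / total for name, value in values.items()}


# Hypothetical unit sales for three illustrative brands
sales = {"Brand A": 500, "Brand B": 300, "Brand C": 200}

shares = pie_shares(sales)
# Labels meant to be shown within or above each slice, not in the legend
labels = [f"{name}: {pct:.1f}%" for name, pct in shares.items()]
```

The resulting labels can be passed to whichever charting program is used, so that the reader sees the magnitude of each slice directly.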


Bar charts and histograms: Bar diagrams are a very useful representation
of the quantum or magnitude of different objects on the same parameter. The
comparative position of objects becomes very clear. The usual practice is to
formulate vertical bars; however, it is possible to use horizontal bars as well
if none of the variables is time related [Figure 14.7(a)]. Horizontal bars are
especially useful when one is showing both positive and negative patterns on
the same graph [Figure 14.7(b)]. These are called bilateral bar charts and are
especially useful to highlight the objects or sectors showing a varied pattern
on the studied parameter. It is possible to generate bar graphs with relative
ease with computer programs today, and the spacing between the bars can
be extremely precise as compared to charts created by hand.

Fig. 14.7(a) Bar Chart per Day, Unit Sales (Thousands) at Fast Food Outlets in Mumbai

Fig. 14.7(b) Bilateral Bar Chart: the Brand Recall and Brand Purchase Response for
Pizza Joints in the NCR


Another variation of the bar chart is the histogram (Figure 14.8). Here the
bars are vertical and the height of each bar reflects the relative or cumulative
frequency of that particular variable.

Fig. 14.8 Histogram (with Normal Curve) Displaying Marks in a Course on Research
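
The relative and cumulative frequencies that set the bar heights of a histogram can be worked out before any chart is drawn. The following is a minimal sketch in Python under stated assumptions: the ten marks, the class width of 20 and the function name frequency_table are hypothetical and are not the data behind Figure 14.8. Classes with no observations are simply left out of the table in this sketch.

```python
def frequency_table(values, bin_width, start=0):
    """Group values into equal-width classes.

    Returns rows of (lower bound, count, relative frequency, cumulative relative frequency).
    """
    counts = {}
    for v in values:
        lower = start + ((v - start) // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1

    n = len(values)
    table, running = [], 0
    for lower in sorted(counts):
        running += counts[lower]
        table.append((lower, counts[lower], counts[lower] / n, running / n))
    return table


# Hypothetical marks of ten students
marks = [35, 42, 48, 51, 55, 58, 62, 67, 71, 88]
table = frequency_table(marks, bin_width=20)
```

Each row gives one bar: the relative frequency is used for an ordinary histogram and the cumulative relative frequency for a cumulative one.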
Pictogram: A pictogram represents data by means of pictures or symbols.
Pictograms are most often used in popular and general reading material,
such as magazines and newspapers, as they are eye-catching and easy for
everyone to comprehend. They are not a very accurate or scientific
representation of the actual data and, thus, should be used with caution
in an academic or technical report.


Geographic representation: Geographic or regional maps of countries,
states, districts or territories can be used as a base to show occurrence
of the studied variable in various regions or to show comparative analysis
about major brands or industries or minerals. In case of comparative data,
the researcher must provide the legend in the displayed map.


Check Your Progress 


5. Why does a report need a clear statement of purpose?
6. When should line charts be used?
7. What is the difference between a line chart and a pie chart?
8. What is a pictogram?


14.8 ANSWERS TO CHECK YOUR PROGRESS
QUESTIONS


1. Research reports can be categorized on the following bases:
• On the basis of size
• On the basis of information
• On the basis of representation


2. A popular report is formulated when there is a need to draw the 
conclusions of the findings of the research report. One of the main 
considerations to be kept in mind while formulating a research report 
is that it must be simple and attractive. 


3. The usual steps involved in writing a research report are as follows:
• Logical analysis of the subject-matter
• Preparation of the final outline
• Preparation of the rough draft
• Rewriting and polishing
• Preparation of the final bibliography
• Writing the final draft

4. The letter of transmittal is the letter that goes alongside the formalized 
copy of the final report. It broadly refers to the purpose behind the 
study. The tone in this note can be slightly informal and indicative of 
the rapport between the client-reader and the researcher. 

5. A report must have a clear and meaningful purpose that can be translated
into effective management action. A clear statement of purpose helps
prepare a well-focussed report on which the management can work.

6. When the objective is to demonstrate trends and some sort of pattern in 
the data, a line chart is the best option available to the researcher as the 
line is able to clearly portray any change in pattern during a particular 
time period. 

7. The critical difference between a line and pie chart is that the pie chart 
cannot show changes over time. It simply shows the cross-section of 
a single time period. 

8. A pictogram represents data by means of pictures or symbols. Pictograms
are most often used in popular and general reading material, such as
magazines and newspapers, as they are eye-catching and easy for
everyone to comprehend.


14.9 SUMMARY


• Once a research project reaches its conclusion, the most important task
ahead of the researcher is to document the entire work done in the form
of a well-structured research report.

• There are brief reports which, as the name suggests, are shorter in
length and could be in the form of working papers or short survey
reports. These might be expanded while preparing the detailed report.

• The detailed report may vary in scope and style depending on the
requirements of the reader for whom it is created. Detailed reports could
be in the form of highly structured and comprehensive technical reports
or simpler action-oriented business reports.

• No matter what the orientation is, reports generally follow a standardized
structure. The entire report can be divided into three main sections: the
preliminary section, the main body and the endnotes.

• The preliminary section typically includes the title page, the table of
contents, the letter of authorization and the letter of transmittal. The
most significant part of this section is a short, succinct executive
summary, which summarizes the main report.

• The main report includes the background of the study, as well as the
scope, framework and methodology of the study, including the data
collection and sampling plan. The section culminates in the most
important part of the report: the study findings and the interpretation
of these results.

• The last section of the report includes the bibliography and all the
supporting documents, such as the measuring instrument (questionnaire),
the sample details and any relevant document that needs to be referred
to in order to comprehend the report.

• Any well-documented report must be clear and explicit in its reporting.
There must be no ambiguity in either the presentation of the findings or
their representativeness. The report must be designed keeping the
capabilities of both the reader and the researcher in mind.

• The author must follow a widely accepted protocol for reporting and
referencing in the report. The reporting needs to be objective and
simple rather than complex and opinionated.

• The researcher at times might need to present the research study
verbally. These presentation sessions need to be brief and crisp, with
the thrust being more on the methodology and findings.

• Communicating and presenting the research results is both a skill and
an art, and the richness of the research findings needs to be appropriately
shared with the interested listeners in a manner best suited to their
individual needs.


14.10 KEY WORDS


• Technical Reports: These are major documents and would include
all elements of the basic report, as well as the interpretations and
conclusions, as related to the obtained results.

• Business Reports: These reports include conclusions as understood
by the business manager.

• Letter of Transmittal: This is the letter that goes alongside the
formalized copy of the final report. It broadly refers to the purpose
behind the study.

• Bibliography: This is an important part of the final section as it provides
the complete details of the information sources and papers cited in a
standardized format.

• Footnote: A typical footnote, as the name indicates, is part of the main
report and comes at the bottom of a page or at the end of the main text.

• Citation: It is the acknowledgement in your writing of the work of
other authors and includes paraphrasing and making direct quotes.


14.11 SELF ASSESSMENT QUESTIONS AND
EXERCISES


Short-Answer Questions 


1. What are the different bases of classifying research reports?
2. State the various principles on which a good research report is based.
3. Identify the various criteria of evaluating research reports/findings.
4. What are the rules for presenting references and annotations in a
research report?
5. What is geographic representation?


Long-Answer Questions


1. Discuss in detail the steps that a researcher needs to follow to formulate
a good research report. Do the criteria become different for different
kinds of reports? Explain with examples.

2. What should be the ideal structure of a research report? What are the
elements of the structure defined by you?

3. What are the guidelines for effective report writing? Illustrate with
suitable examples.

4. ‘Visual representations of results are best understood by a reader, thus
special care must be taken for this formulation.’ Examine the truth of
this statement by giving suitable examples.

5. What are the guidelines a researcher must follow for graphical and
tabular representation of the research results? Discuss.